This group is the work environment for those interested in the Nodes Portal Toolkit (NPT)


NPT Startup pre-release

Burke Chih-Jen Ko
1424 days ago

Dear all,

We have made a pre-release version of NPT Startup. Please refer to the github repository if you want to evaluate the progress.

There are further refinements in the queue. We'll keep you updated via this group and the repository.

Many thanks for your support!

Best wishes,

Burke

Deden Sumirat Hidayat
1413 days ago

Thanks for the source, but there is a problem during the installation process. It looks like this: http://farm3.staticflickr.com/2818/10099745403_037ca33dc4_h.jpg. Please help :)

Burke Chih-Jen Ko
1413 days ago

Hi Deden,

Thanks for trying. It looks like some MySQL configuration can be done to avoid that. Could you please see if the documentation works for you?

Many thanks,

Burke

Deden Sumirat Hidayat
1413 days ago

Thanks for the advice. The first problem is solved, but another problem has come up. It looks like this:

image

Strangely, when I stop the web server application and then start it again, everything is normal:

image

Is this normal, or is there a hidden problem?

Sorry for always taking up your time.

many thanks,

d2n

Burke Chih-Jen Ko
1412 days ago

Hi Deden,

No disturbance at all!

Are you using the latest version of the code? Please download the beta version again and make sure all of the code is readable by the web server process.

As your second screenshot shows, it looks like the installation did not complete, because it's not showing the NPT Startup theme. So please don't stop the installation. I understand the pause between steps is a bit confusing, so I'll see if I can either give a hint or shorten the time that step takes.

Let me know if this helps.

Thanks,

Burke

Reuben Roberts
1375 days ago

Hi there.

Just trying to set up the beta, and I've run into a few issues.

Firstly, installation using PostgreSQL did not work: at some point in the long process of installing the 100+ modules it crashed. I deleted the site and started again using a MySQL database, and it worked OK.

During the configuration stage, however, the script exceeded the maximum execution time of '60 seconds' (although my PHP is configured to allow 300 seconds). Rerunning this stage produced an error about exceeding an execution time of '240 seconds', although it did complete the setup. I don't know where these values are coming from; I can't find anything like them in my Apache or MySQL configurations.

I don't know why, but performance is very sluggish. I do a lot of web development and something is wrong here: I'm working on a reasonable development machine and page loads should not be taking 30s+. Do you have any other details on PHP / MySQL / Apache configuration that could sort these issues out?

My test site has the following menu items: "Home, News, Events, Synopsis of Biodiversity, Synopsis of Biodiversity, Facts, About us, About us, Biographies", so you can see that things got hosed up. I'd love to install this correctly.

Thanks!

Reuben Roberts
1375 days ago

Hi again.

 

When importing specimen/observation data, I see the template contains the following fields:

GUID, Basis of record, Catalogue number, Collection code, Collector (UID), Collector (Name on site), Collector number, Count, Date collected (Start), Date collected (End), Date identified (Start), Date identified (End), GenBank number(s), Identification qualifier, Identified by (UID), Identified by (Name on site), Institution code, Lifestage, Location (NID), Location (Title), Field notes, Field number, Other catalogue numbers, Remarks, Sex, Taxonomic name (Name), Taxonomic name (TID), Taxonomic name (GUID), Type status

There is nothing for spatial data (latitude / longitude etc.), except for a Location ID (and I can't see a way to load coordinates using the Location data template either!). Can these columns be added? In general, I would expect to be able to load data as per Darwin Core, not an arbitrarily selected subset of fields. Thanks!

Burke Chih-Jen Ko
1375 days ago

Hi Reuben,

 

First of all, thanks for installing and reporting issues. There are several questions, and I'll try to answer them below:

 

1. Compatibility with PostgreSQL: Although Drupal can be installed on PostgreSQL, NPT Startup, together with many Scratchpads modules, has not been fully tested on PostgreSQL yet. I am aware that some modules have queries not coded against PDO, and some others try to access MySQL-specific geospatial features. That is probably why the installation didn't go through. We probably can't address this anytime soon, since MySQL should do the job well for the targeted user group.

 

2. On timeout: I am attaching the configuration of my MySQL and PHP (my.cnf[1] and php.ini[2] respectively) for your reference. You could first try adjusting max_execution_time and memory_limit in php.ini. max_allowed_packet might be set very low in some MySQL setups; I have 64M. You may want to take a look at this page[3] and suggest anything we can learn from your experience.

 

3. On performance: On my machine (Core i7 2.6 GHz, 8 GB RAM), with the attached configuration, it takes 5 minutes to install all 197 modules, and 14 minutes to import all 6800+ names for GBIF Benin. I am not sure which step you mean by 'page load', but 30+ seconds is definitely not normal, and I'd like to work with you to find out where it goes wrong.

 

4. on Importing Location:

1) To batch import location information, particularly in DwC-A format, you probably want to take a look at the DwC-A importing module and library developed by INBio[4][5]. It digests a DwC-A zip file and diverts the content into the various content types defined in NPT Startup, which inherits biodiversity content management from Scratchpads. To my knowledge, the DwC-A importer automates a two-pass process that first analyses and imports locations, and then specimens/observations.

2) To import locations using the accompanying template, "Map" is the column where you put the geospatial information as text. There is help text in the latest version of Scratchpads, but it has not yet been merged into NPT Startup:

 

Please enter data in the following format, entering multiple values on a single line (this is easier to do in a text editor, and then paste into Excel):

 

{TYPE OF DATA}:{DATA}

REGION:{TDWG region code}

POINT:{latitude},{longitude}

POLYGON:{Well known text}

POLYLINE:{Well known text}

RECTANGLE:{Well known text}

 

e.g.

 

REGION:SPA-SP

REGION:FRA-FR

REGION:GER-OO

REGION:1

POINT:56.802292017627,-3.1201171875

POINT:53.2798967926641,3.0322265625

POLYGON:POLYGON ((-33.0029296875 60.37170546875135,-20.5224609375 60.84616839706054,-18.6767578125 59.22225529448783,-21.4892578125 55.926032385960966,-29.4873046875 55.579804150399035,-35.4638671875 57.08991816945666,-35.9912109375 59.22225529448783))
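Before pasting values into the "Map" column, it can help to check that each entry follows the {TYPE OF DATA}:{DATA} pattern described above. The following is a minimal Python sketch, not part of NPT Startup or Scratchpads: the names `parse_map_value`, `Point` and `KNOWN_TYPES` are hypothetical, and the latitude/longitude range check is an assumption about what counts as a valid POINT.

```python
from typing import NamedTuple, Tuple, Union


class Point(NamedTuple):
    latitude: float
    longitude: float


# The five data types listed in the Scratchpads help text.
KNOWN_TYPES = {"REGION", "POINT", "POLYGON", "POLYLINE", "RECTANGLE"}


def parse_map_value(value: str) -> Tuple[str, Union[Point, str]]:
    """Split one '{TYPE OF DATA}:{DATA}' entry and validate POINT values.

    Hypothetical helper for pre-checking spreadsheet cells; other types
    are returned as raw text (TDWG code or well-known text).
    """
    data_type, separator, data = value.strip().partition(":")
    if not separator or data_type not in KNOWN_TYPES:
        raise ValueError(f"unrecognised Map entry: {value!r}")
    if data_type == "POINT":
        # POINT is written as latitude first, then longitude.
        lat_text, lon_text = data.split(",")
        lat, lon = float(lat_text), float(lon_text)
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range: {value!r}")
        return data_type, Point(lat, lon)
    return data_type, data


if __name__ == "__main__":
    for entry in ["REGION:SPA-SP", "POINT:56.802292017627,-3.1201171875"]:
        print(parse_map_value(entry))
```

Only the first colon is treated as the separator, so an entry such as POLYGON:POLYGON ((...)) keeps its well-known text intact.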

 

The NPT Startup is targeting those seeking a web presence and utilises as much information online as possible through web services - effectively it is a mash up of information.  It is not targeting those who are looking to manage large volumes of (e.g.) occurrence content for which it has not been tested, although could be developed to support that in the future.

 

Please let me know if this information helps you set up, and I would like to learn more from you about how this tool should be improved.

 

Cheers,

 

Burke

 

[1] https://dl.dropboxusercontent.com/u/66568061/configuration/my.cnf.bko

[2] https://dl.dropboxusercontent.com/u/66568061/configuration/php.ini.bko

[3] https://github.com/gbif/gbif-npt-startup/wiki/Installation

[4] https://github.com/burkeker/npt-inbio

[5] https://github.com/burkeker/DwC-A-PHP-Library

Reuben Roberts
1374 days ago

Hi Burke

Thanks for getting back to me so quickly, much appreciated!

1. On the compatibility with PostgreSQL - that's fine, maybe remove the installation option for PostgreSQL until it is implemented?

 

2. On timeout: Here's the content of my PHP error log:

[11-Nov-2013 20:32:55] PHP Fatal error:  Maximum execution time of 60 seconds exceeded in C:\Inetpub\wwwroot\gbif-npt-startup-1.0.0-beta\includes\module.inc on line 329

[12-Nov-2013 08:27:01] PHP Fatal error:  Maximum execution time of 240 seconds exceeded in C:\Inetpub\wwwroot\gbif-npt-startup-1.0.0-beta\includes\database\database.inc on line 2168

Which is strange, because my PHP config contains 'max_execution_time = 1200'. The only variable in my PHP config which is more restrictive than yours is 'mysql.connect_timeout = 60' (as opposed to your value of 180). I'll tweak that, rerun the installation, and let you know how that works.
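The limits that were actually in force can be read straight out of the log lines quoted above and compared with the configured value. A minimal Python sketch, assuming the log keeps the format shown in this post; `effective_limits` and `LIMIT_PATTERN` are hypothetical names, not part of PHP or Drupal.

```python
import re

# Matches the "Maximum execution time" fatal error format quoted above.
LIMIT_PATTERN = re.compile(
    r"^\[(?P<when>[^\]]+)\] PHP Fatal error:\s+"
    r"Maximum execution time of (?P<seconds>\d+) seconds? exceeded "
    r"in (?P<file>.+) on line (?P<line>\d+)$"
)


def effective_limits(log_lines):
    """Return (seconds, file, line) for every execution-time error found."""
    found = []
    for raw in log_lines:
        match = LIMIT_PATTERN.match(raw.strip())
        if match:
            found.append(
                (int(match["seconds"]), match["file"], int(match["line"]))
            )
    return found


if __name__ == "__main__":
    configured = 1200  # max_execution_time as set in php.ini
    log = [
        r"[11-Nov-2013 20:32:55] PHP Fatal error:  Maximum execution time of 60 seconds exceeded in C:\Inetpub\wwwroot\gbif-npt-startup-1.0.0-beta\includes\module.inc on line 329",
        r"[12-Nov-2013 08:27:01] PHP Fatal error:  Maximum execution time of 240 seconds exceeded in C:\Inetpub\wwwroot\gbif-npt-startup-1.0.0-beta\includes\database\database.inc on line 2168",
    ]
    for seconds, path, line_no in effective_limits(log):
        if seconds < configured:
            print(f"limit of {seconds}s in force at {path}:{line_no}")
```

Any limit reported here that is lower than the php.ini value suggests the setting is being overridden somewhere, or that a different php.ini is being loaded.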

For MySQL it's a little harder to compare, but the only things that stand out for me are max_connections=100 (as opposed to your 300), table_cache=256 (as opposed to 768) and tmp_table_size=369M (as opposed to your tmp_table_size = 524288000). The DB engine is tuned for querying large datasets as opposed to updating. I'll try the PHP config changes first, though.
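The two tmp_table_size values are written in different units, so converting both to bytes makes the comparison easier. A minimal Python sketch; `to_bytes` is a hypothetical helper, and it assumes the K/M/G suffixes are multiples of 1024, as MySQL option files treat them.

```python
# Suffix multipliers as used in MySQL option files (powers of 1024).
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value: str) -> int:
    """Convert a size such as '369M' or '524288000' to a byte count."""
    text = value.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


if __name__ == "__main__":
    mine = to_bytes("369M")
    yours = to_bytes("524288000")
    print(mine, yours, yours // SUFFIXES["M"])
```

On that scale, 524288000 bytes is exactly 500M, so the 369M setting is the smaller of the two.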

3. Performance. I'm running with dual 2.4GHz processor, 4GB RAM, 25GB free disk space on C: (note that it's a Windows XP x64 O/S). Maybe there's some Apache config issue. Also, I'm on a 1MB internet connection, which is not great, but probably better than some potential users :)

4. On loading data. Well, I suppose this can be discussed at some length, maybe on a separate thread.  My experience has been that getting users comfortable with the format and process required to load data using the IPT has been enough of a learning curve, so there will definitely be push-back on a new protocol.  Secondly, the coordinates embedded in the kind of occurrence data I have been exposed to (southern Africa biodiversity) should not be promoted to 'places' in their own right, at the same level as e.g. a named place.  They are coordinates, they are often imprecise, and even records with the same coordinates are not necessarily referring to the same locality (due to imprecision, error, etc.).  Lastly, the TDWG region codes are not very useful.  They don't mesh well with ISO codes, and for southern Africa in particular they are quite strange (obsolete provinces, missing countries, I can dig up the details if you're interested).  Maybe they are working to fix some of these issues...

Cheers

Reuben

Burke Chih-Jen Ko
1374 days ago

Thanks Reuben, let's get you through the installation first.

2. -> Are you sure your PHP is picking up the configuration from where you expect? You may want to double-check that with <?php phpinfo(); ?> and see whether the values it reports match your php.ini.

1. -> About PostgreSQL as an option for database, I've put an issue here.

3. -> With a slow connection, the installation might take longer because it needs to call external web services.

4. I agree we should start another thread on this.

Let me know if you have any luck installing with the new configuration.

Cheers,

Burke

 

Reuben Roberts
1374 days ago

Ok, this time it installed on the first try (with the PHP config change), although it took about 2 hours, with the httpd process running at full throttle the whole time.  I might set up a separate instance and try tweaking the MySQL configuration to see if that has any effect, although the other database-driven sites I run on the machine perform ok.

Once installed, I went to import a taxon list (seemed a reasonable thing to do).  I get:

Completed 0 of 2.
Downloaded 177 names of approximately 689159
Approximate time remaining: 1 week 4 days
This is for the list of plant names from EOL. At the same time I did a speed test to the US and got about 0.7Mbps. The download via what I assume is a web service is unworkably slow: downloading the EOL dataset directly shouldn't take more than an hour, surely?

Thanks for any comment on this.

Cheers

Reuben

 

 

Reuben Roberts
1373 days ago

Hi guys

Thanks for the comments - I only saw them after I made my last post.

As I said, the installation is complete now, despite it being very slow. I only changed the PHP timeout value. If I have a chance I will see if I can improve performance, but I'm hesitant to change the innodb_flush_log_at_trx_commit configuration for MySQL: there are other sites using MySQL, not just the portal!

What I want to accomplish is to get my test site looking like the public demo site, i.e. I want to load some occurrence records (not many, maybe only 100 or so), and configure the site to display the Google Map with these overlaid on it.

I have used the Excel templates to create some locations and specimen/occurrence records.  But I cannot easily see how to (a) assign coordinates to places or (b) how to configure the interface to display the map (or even simply list the data as imported).  Burke, I know you mentioned the InBio tools and I will get to that; at the moment I am trying to simply use the portal, since I would assume the portal offers the functionality - at least as far as achieving the same as the public demo site.

Maybe I have the wrong preconception of what the portal is intended to provide, or there is some bigger picture I am unaware of?

Thanks for any guidance on this.

R