2016-09-16

-- Crawler Halting
Thanks Chris... please see the video. I made a change in crawler options and another in appearance settings, and both have the same effect. I'm scratching my head on this :)
Thanks, Paul
(Attachment: yioop.mov)

-- Crawler Halting
Okay. What I learned from your video is that you encountered the error the way I thought you were encountering it. You didn't take a full browser window capture, so I couldn't see the URLs as you were running. Between version 3.4 and the present, I have changed the cross-site request forgery (CSRF) prevention token from a format with a vertical bar | to one with a *. I seem to remember this had something to do with making sure the URLs were spec-compliant. This seems to work in all the browsers I am using (Firefox, Safari, Chrome, IE, Edge), but maybe there is something configured on your machine that is messing up the token on either the server or the client side? The code itself is in the file src/controllers/Controller.php, in the two methods generateCSRFToken and checkCSRFToken.
Best,
Chris
(Edited: 2016-09-16)
2016-09-17

-- Crawler Halting
Thank you again for your continued assistance Chris...
So I think I can rule out the client side, as I tried from several Internet connections (I was travelling all week) and just tried from a Windows 10 machine with Edge and IE at home.
In an effort to determine where this is occurring, I built a new VM on DigitalOcean this morning using Ubuntu 16.04 LTS and did the following for reference:
 apt-get install curl
 apt-get install apache2
 apt-get install php
 apt-get install php-cli
 apt-get install sqlite3 libsqlite3-dev
 apt-get install php-curl
 apt-get install php-gd
 apt-get install libapache2-mod-php7.0
 apt-get install php-mbstring
 apt-get install php-sqlite3
 a2enmod rewrite
 service apache2 restart
The /var/www/html directory looks like this:
 root@spider-test:/var/www/html# ls -a -l
 total 92
 drwxr-xr-x  5 root     root      4096 Sep 17 16:43 .
 drwxr-xr-x  3 root     root      4096 Sep 17 16:39 ..
 -rw-r--r--  1 www-data www-data   892 Sep 17 16:38 composer.json
 -rw-r--r--  1 www-data www-data   924 Sep 17 16:38 composer.lock
 -rw-r--r--  1 www-data www-data   665 Sep 17 16:38 .htaccess
 -rw-r--r--  1 www-data www-data 11321 Sep 17 16:39 index.html
 -rw-r--r--  1 www-data www-data  1492 Sep 17 16:38 index.php
 -rwxr-xr-x  1 www-data www-data  1940 Sep 17 16:38 INSTALL
 -rwxr-xr-x  1 www-data www-data 36068 Sep 17 16:38 LICENSE
 -rwxr-xr-x  1 www-data www-data  2555 Sep 17 16:38 README
 drwxr-xr-x 14 www-data www-data  4096 Sep 17 16:38 src
 drwxr-xr-x  3 www-data www-data  4096 Sep 17 16:38 tests
 drwxr-xr-x  2 www-data www-data  4096 Sep 17 16:38 work_directory
I modified the Apache default host configuration file to allow overrides so that .htaccess would function:
 root@spider-test:/etc/apache2/sites-available# more 000-default.conf 
 <Directory /var/www/html>
        Options Indexes FollowSymLinks
        AllowOverride All
 </Directory> 
….
Without changing any parameters I went into Manage Crawls and made a minor change. The URL was: x.x.x.x/admin?c=admin&a=manageCrawls&arg=options&YIOOP_TOKEN=ibQg8wDgGek*1474131722
When I submitted the change I got kicked back to the login screen again, but interestingly the URL was exactly the same: x.x.x.x/admin?c=admin&a=manageCrawls&arg=options&YIOOP_TOKEN=ibQg8wDgGek*1474131722
Looking for ideas as I continue to mess around with this... I don't believe it's a permissions issue... maybe a sqlite issue or a token issue?
Thanks, Paul
(Edited: 2016-09-17)
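As a quick sanity check on the token itself, the value in the URL above can be split on the `*` delimiter Chris described. This is only a sketch: the variable names are made up for illustration, and it says nothing about whether the hash part is valid, since that depends on the server's AUTH_KEY.

```shell
# Token copied from the URL quoted in the post above.
token='ibQg8wDgGek*1474131722'

# Everything before the first literal "*" (illustrative variable names).
hash_part="${token%%\**}"
# Everything after the first literal "*".
time_part="${token#*\*}"

echo "hash part: $hash_part"
echo "time part: $time_part"
```

If either part comes out empty, or equal to the whole token, the delimiter was probably changed somewhere between the page and the server.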

-- Crawler Halting
Apologies for the previous cut/paste job... kinda messy.
I rebuilt another box and installed 3.3.0 on it to see if I could reproduce the issue. Under 3.3.0 I do not encounter the issue with updating crawler options. I do still have the issue where starting a crawl seems to stall, though. Hoping this helps with any suggestions you have. Thanks!
(Edited: 2016-09-17)

-- Crawler Halting
Hey Paul,
Can you take the code from the earlier version that works for you of the two methods generateCSRFToken and checkCSRFToken in src/controllers/Controller.php and replace the newer version with them? Does that get it to work?
Best, Chris

-- Crawler Halting
Thanks ... just tried that and no luck either... don't believe it's token related.

 root@spider01:/var/www/html/src/controllers# diff Controller.php Controller.php.3.4.295-gba1631c
 520c520
 < return L\crawlHash($user.$time . C\AUTH_KEY)."|$time";
 ---
 > return L\crawlHash($user.$time . C\AUTH_KEY)."*$time";
 530c530
 < public function checkCSRFToken($token_name, $user)
 ---
 > public function checkCSRFToken($token_name, $user)
 535c535
 < $token_parts = explode("|", $_REQUEST[$token_name]);
 ---
 > $token_parts = explode("*", $_REQUEST[$token_name]);
 538c538
 < L\crawlHash($user.$token_parts[1] . C\AUTH_KEY) ==
 ---
 > L\crawlHash($user . $token_parts[1] . C\AUTH_KEY) ==
 927c927
 < }
 ---
 > }
If I may ask, is the production code running on yioop.com for example different or based on the 3.2.x code branch? Thank you....

-- Crawler Halting
I don't suppose there is a way you could email me at chris@pollett.org with info to ssh into one of your boxes so I can try to figure out what's going wrong? If I can fix it, I'll post the fix here. You can change the passwords and stuff immediately after I'm done.
Best,
Chris
2016-09-18

-- Crawler Halting
Hi Paul,
When I went in and looked at your version of Yioop, it looked like you had modified the downloaded code in each of the elements to add a urlencode function on the hidden form variables for the CSRF_TOKEN. I moved your code into /var/www/html-old and put a fresh version of the code into /var/www/html. If you look at a file like src/views/elements/CrawloptionsElement.php, you see changes like:
 <             C\CSRF_TOKEN."=".$data[C\CSRF_TOKEN] ?>"  ><?=
 ---
 >             C\CSRF_TOKEN."=".urlencode($data[C\CSRF_TOKEN]) ?>"  ><?=
I looked at past versions of my code and I never had the urlencode function. This was causing the tokens to break.
Best,
Chris
(Edited: 2016-09-18)
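To illustrate one way an extra urlencode could break the token: PHP's urlencode turns `*` into `%2A`, so if the encoded value reaches the check without being decoded again, a split on `*` no longer finds a timestamp. The sketch below only imitates that single substitution with sed. The helper name has_delimiter is hypothetical, and none of this is Yioop's own code.

```shell
# Succeeds if the argument contains a literal "*" (hypothetical helper).
has_delimiter() {
    case "$1" in
        *\**) return 0 ;;
        *)    return 1 ;;
    esac
}

# Token copied from the URL quoted earlier in the thread.
token='ibQg8wDgGek*1474131722'

# Imitate the one change urlencode would make to this token: "*" becomes "%2A".
encoded=$(printf '%s' "$token" | sed 's/\*/%2A/g')

echo "original: $token"
echo "encoded:  $encoded"

if has_delimiter "$encoded"; then
    echo "delimiter still present"
else
    echo "delimiter lost"
fi
```

Whether the encoded value is decoded again before the check depends on where it is emitted (a link versus a hidden form field), so treat this as a possible mechanism rather than a confirmed one.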

-- Crawler Halting
Chris... thank you so much for taking the time to fix this! I honestly am completely at a loss as to where the urlencode entry came from, as it wasn't anything I did to the code. I still have the original ZIP file from yioop-v3.4.0-295-gba1631c that I downloaded via GIT. I just unzipped it again and the same urlencode option is in there... completely at a loss.
I did it just now using git clone and it's not there... I know this sounds crazy, but it appears something is adding this to the code, and I have no clue what it could be.
Regardless, I'm back on track - sorry to have taken up your time on something like this .. I was very stumped and again appreciate your help!
Thanks, Paul
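For anyone hitting something similar, a recursive fixed-string grep is a quick way to see which copy of the source carries the extra urlencode call. The sketch below builds two throwaway directories standing in for the unzipped archive and the git clone (the paths and file contents are made up for the demo); on a real machine you would point the function at the actual trees.

```shell
# List files under a directory that wrap the CSRF token in urlencode().
# -r recurse, -l print matching file names only, -F treat the pattern as a fixed string.
find_urlencoded_token() {
    grep -rlF 'urlencode($data[C\CSRF_TOKEN])' "$1"
}

# Demo with stand-in trees; replace these with the real unzip and clone directories.
demo=$(mktemp -d)
mkdir -p "$demo/from_zip" "$demo/from_clone"
printf '%s\n' 'C\CSRF_TOKEN."=".urlencode($data[C\CSRF_TOKEN]) ?>"' \
    > "$demo/from_zip/CrawloptionsElement.php"
printf '%s\n' 'C\CSRF_TOKEN."=".$data[C\CSRF_TOKEN] ?>"' \
    > "$demo/from_clone/CrawloptionsElement.php"

echo "zip copy:"
find_urlencoded_token "$demo/from_zip"
echo "clone copy:"
find_urlencoded_token "$demo/from_clone" || echo "(no matches)"
```

Running `diff -r` on the two real trees would show every difference between them, not just this one string.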

-- Crawler Halting
Hi Paul,
I agree the above is kinda weird, so I was trying to understand how it could happen. I just double-checked that I never had urlencode on that item by doing
 git log CrawloptionsElement.php
That file has been changing pretty slowly, so I only had to check four or five git diffs to confirm that I did not use that urlencode function at least as far back as July 2015. I also double-checked downloading through the download form and through the viewgit links, to see if something strange was going on. The downloads still didn't have the urlencode function. Oh well. At least it's working for you now.
Best, Chris
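A possible shortcut for this kind of history check is git's pickaxe search, `git log -S`, which lists only the commits that changed the number of occurrences of a string. The helper name below is hypothetical, and it has to be run from inside the repository.

```shell
# Show commits that added or removed occurrences of a string in one file.
# Usage: commits_touching_string <string> <path>   (hypothetical helper name)
commits_touching_string() {
    git log --oneline -S"$1" -- "$2"
}
```

For example, `commits_touching_string urlencode src/views/elements/CrawloptionsElement.php` would print nothing if no commit in the history ever added or removed that string in the file.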