-- Which file executes the SQL queries in Yioop, and does Yioop index all the languages of the Wikipedia pages?
                
Hi,
-  You can index whatever pages you want by listing them as seed sites for the crawl you want to carry out.
-  Yioop does not use a SQL database to store crawls. It can use SQLite, MySQL, or PostgreSQL to store group and wiki information and other non-crawl data. Crawls are stored in work_directory/cache, in folders whose names begin with IndexData followed by a timestamp. Each of these folders has three sub-folders: dictionaries, which contains binary files that let you look up, from the hash of a query term, which index shards have postings for that term; posting_doc_shards, which contains index shard files, each a binary file of postings that record, for each occurrence of a query term, which document contained it; and summaries, which contains web archive files with compressed summaries of each web page downloaded.
 
In Version 5 of Yioop, the work_directory/cache folder also holds Archive folders, which contain the compressed full pages that were downloaded during a crawl.
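As a rough illustration of the layout described above, here is a minimal Python sketch that lists the IndexData folders in a cache directory and reports which of the three sub-folders each one contains. The helper name `list_crawl_folders` and the path passed to it are assumptions for the example, not part of Yioop; point it at your own work directory.

```python
from pathlib import Path

# Sub-folder names as described in the post above.
EXPECTED_SUBFOLDERS = ("dictionaries", "posting_doc_shards", "summaries")


def list_crawl_folders(cache_dir):
    """Map each IndexData* folder in cache_dir to the expected sub-folders it has."""
    cache = Path(cache_dir)
    crawls = {}
    for folder in sorted(cache.glob("IndexData*")):
        if folder.is_dir():
            crawls[folder.name] = [
                name for name in EXPECTED_SUBFOLDERS if (folder / name).is_dir()
            ]
    return crawls


if __name__ == "__main__":
    # Hypothetical location; substitute the path of your own work directory.
    for name, subfolders in list_crawl_folders("work_directory/cache").items():
        print(name, subfolders)
```

This only inspects folder names; the files inside are binary and are meant to be read by Yioop itself.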
To understand how the indexing and crawl processes work, you should read [[https://www.seekquarry.com/p/Ranking|Yioop Ranking]].
If you want to find out where the SQL database that Yioop uses for groups and wikis is located and what it contains, go to Server Settings and look at how it is configured under Database Set-Up. Typically, a SQLite database is used, and it is stored in work_directory/data/public_data.db. Its contents can be viewed using any SQLite viewer.
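If you do not have a SQLite viewer handy, Python's standard `sqlite3` module can serve as one. The sketch below lists the tables in a SQLite file; the helper name `list_tables` is made up for the example, and the path is the typical location mentioned above, which may differ on your installation. It makes no assumption about which tables Yioop creates.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of the tables in a SQLite database file."""
    # Open read-only so that inspecting the file cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Typical location according to the post; adjust to your own set-up.
    print(list_tables("work_directory/data/public_data.db"))
```

Once you know the table names, you can run ordinary `SELECT` queries against them over the same kind of read-only connection.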
(Edited: 2018-07-04)
 
Most of the questions you are asking are answered in the [[https://www.seekquarry.com/p/Documentation|Documentation for Yioop]] if you search that page.