========= Array Structure for getPages() ===========
The array used with getPages() is an array of arrays, so you can download multiple web pages at
the same time. After the download, each subarray [CrawlConstants::URL => "some_url"] is populated
with additional fields:
[CrawlConstants::URL => "some_url",
CrawlConstants::PAGE => "downloaded_page",
. . . etc
]
The list of CrawlConstants codes can be found in src/library/CrawlConstants.php. The point of
using CrawlConstants::PAGE rather than "PAGE" is to catch errors caused by slightly mistyping
the field name. For example, CrawlConstants::PAGEE will cause an error if it is not defined, but
"PAGEE" won't. The constants are strings to make for more efficient serialization.
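
To make the shape concrete, here is a minimal sketch of the before/after structure and of the typo-catching behavior described above. Everything in it is an assumption for illustration: DemoCrawlConstants is a hypothetical stand-in for the real constants (its string values are placeholders), and fakeGetPages() is a hypothetical stand-in that does no downloading and only fills in a PAGE field.

```php
<?php
// Hypothetical stand-in for the constants in src/library/CrawlConstants.php;
// the string values here are placeholders, not the real ones.
class DemoCrawlConstants
{
    const URL = "demo_url";
    const PAGE = "demo_page";
}

// Hypothetical stand-in for getPages(): no network access, it only adds a
// PAGE field to each subarray to show the before/after shape.
function fakeGetPages(array $sites): array
{
    foreach ($sites as $i => $site) {
        $sites[$i][DemoCrawlConstants::PAGE] =
            "<html>contents of " . $site[DemoCrawlConstants::URL] . "</html>";
    }
    return $sites;
}

// Input: one subarray per page to download.
$sites = [
    [DemoCrawlConstants::URL => "https://www.example.com/a"],
    [DemoCrawlConstants::URL => "https://www.example.com/b"],
];
$sites = fakeGetPages($sites);
foreach ($sites as $site) {
    echo $site[DemoCrawlConstants::URL], " => ",
        strlen($site[DemoCrawlConstants::PAGE]), " bytes\n";
}

// A mistyped constant name fails loudly: PHP 7 and later throw an Error.
try {
    $page = $sites[0][DemoCrawlConstants::PAGEE];
} catch (\Error $e) {
    echo "Caught: ", $e->getMessage(), "\n";
}

// A mistyped string key is just a missing key; with ?? it quietly
// becomes null, so the typo goes unnoticed.
$page = $sites[0]["PAGEE"] ?? null;
var_dump($page);
```

The try/catch is only there so the sketch runs to the end; in normal code you would let the Error surface so the typo gets fixed.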
- Professor Pollett
(Edited: 2022-03-21)