README.md: 7 additions & 7 deletions
@@ -225,22 +225,22 @@ In a nutshell, the Crawlbot crawls a set of seed URLs for links (even if a subdo
A combined list of all your crawl and bulk jobs can be fetched via:
-```
+```php
$diffbot = new Diffbot('my_token');
$jobs = $diffbot->crawl()->call();
```
This returns a collection of all crawl and bulk jobs. Each type is represented by its own class: `JobCrawl` and `JobBulk`. Note that jobs contain only information about the job, not its data. To get a job's data, use the `downloadUrl` method, which returns the URL of the dataset:
-```
+```php
$url = $job->downloadUrl("json");
```
### Crawl jobs: Creating a Crawl Job
See the inline comments for a step-by-step explanation:
-```
+```php
// Create new diffbot as usual
$diffbot = new Diffbot('my_token');
@@ -266,7 +266,7 @@ dump($job->getDownloadUrl("json")); // outputs download URL to JSON dataset of t
To get data about a job (this will be the data it was configured with - its flags - and not the results!), use the exact same approach as if creating a new one, only without the API and seeds:
-```
+```php
$diffbot = new Diffbot('my_token');
$crawl = $diffbot->crawl('sitepoint_01');
@@ -282,7 +282,7 @@ While there is no way to alter a crawl job's configuration post creation, you ca
Provided you fetched a `$crawl` instance as in the above section on inspecting, you can do the following:
-```
+```php
// Force start of a new crawl round manually
$crawl->roundStart();
@@ -301,13 +301,13 @@ Note that it is not necessary to issue a `call()` after these methods.
If you would like to extract the generated API call URL for these instant-call actions, pass in the parameter `false`, like so:
-```
+```php
$crawl->delete(false);
```
You can then save the URL and invoke `call()` when you are ready to execute it (if at all).