echo $articleApi->call()->author; // prints out "Bruno Skvorc"
```

That's it, this is all you need to get started.

## Usage - advanced

A full API reference manual is in progress, but the instructions below should do for now - the library was designed with brutal UX simplicity in mind.

### Setup

To begin, always create a Diffbot instance. A Diffbot instance will spawn API instances.
To get your token, sign up at http://diffbot.com

```php
$diffbot = new Diffbot('my_token');
```

### Pick API

Then, pick an API.

Currently available [*automatic*](http://www.diffbot.com/products/automatic/) APIs are:

- [product](http://www.diffbot.com/products/automatic/product/) (crawls products and their reviews, if available)
- [article](http://www.diffbot.com/products/automatic/article/) (crawls news posts, blogs, etc., with comments if available)
- [image](http://www.diffbot.com/products/automatic/image/) (fetches information about images - useful for 500px, Flickr, etc.). The Image API can return several images, depending on how many are on the page being crawled.
- [discussion](http://www.diffbot.com/products/automatic/discussion/) (fetches discussion / review / comment threads - can be embedded in the Product or Article return data, too, if those contain any comments or discussions)
- [analyze](http://www.diffbot.com/products/automatic/analyze/) (combines all the above in that it automatically determines the right API for the URL and applies it)

Video is coming soon.

There is also a [Custom API](http://www.diffbot.com/products/custom/) like [this one](http://www.sitepoint.com/analyze-sitepoint-author-portfolios-diffbot/). Unless otherwise configured, Custom APIs return instances of the Wildcard entity.

All APIs can also be tested on http://diffbot.com

The API you picked can be spawned through the main Diffbot instance, e.g. `$diffbot->createProductAPI('http://someurl.com')` for the Product API.

All APIs have some optional fields you can pass with parameters. For example, to extract the 'meta' values of the page alongside the normal data, call `setMeta`:

```php
$api->setMeta(true);
```

Some APIs have other flags that don't qualify as fields. For example, the Article API can be told to ignore Discussions (i.e. to not extract comments). This can speed up fetching because, by default, it does look for them. The configuration methods all share the same format, though, so to accomplish this, just use `setDiscussion`:

```php
$api->setDiscussion(false);
```

All config methods are chainable:

```php
$api->setMeta(true)->setDiscussion(false);
```
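
Chaining works when every config method returns the object it was called on. The sketch below shows that pattern with a stand-in class; `SketchApi` and its `settings()` helper are hypothetical names made up for illustration and are not part of this library.

```php
// Stand-in class for illustration only; not the library's implementation.
final class SketchApi
{
    /** @var array<string, bool> flags recorded by the config methods */
    private $settings = [];

    public function setMeta(bool $bool): self
    {
        $this->settings['meta'] = $bool;
        return $this; // returning $this is what makes chaining possible
    }

    public function setDiscussion(bool $bool): self
    {
        $this->settings['discussion'] = $bool;
        return $this;
    }

    /** Hypothetical helper so the recorded flags can be inspected. */
    public function settings(): array
    {
        return $this->settings;
    }
}

$api = (new SketchApi())->setMeta(true)->setDiscussion(false);
var_dump($api->settings());
```
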

### Calling

All API instances have the `call` method, which returns a collection of results. The collection is iterable, so you can loop over it with `foreach`.

In cases where only one entity is returned, as with Article or Product, iterating works all the same; it just iterates through the one single element. The return data is **always** a collection!

However, for brevity, you can access properties directly on the collection, too, as in `$articleApi->call()->author`.

In this case, the collection applies the property call to the first element which, coincidentally, is also the only element. If you use this approach on a collection returned by the Image API, the same thing happens - but the call is only applied to the first image entity in the collection.
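
One way a collection can behave like this is to implement `IteratorAggregate` for looping, plus a `__get` magic method that forwards property reads to the first element. The sketch below uses stand-in classes (`SketchEntity`, `SketchCollection`) that are hypothetical and only illustrate the behaviour described here; the library's own classes may be built differently.

```php
// Stand-in classes for illustration only; not the library's own code.
final class SketchEntity
{
    public $author;

    public function __construct(string $author)
    {
        $this->author = $author;
    }
}

final class SketchCollection implements IteratorAggregate
{
    /** @var SketchEntity[] */
    private $entities;

    public function __construct(array $entities)
    {
        $this->entities = array_values($entities);
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->entities);
    }

    // Property reads on the collection are forwarded to the first entity;
    // an empty collection or unknown property yields null.
    public function __get($name)
    {
        return $this->entities[0]->$name ?? null;
    }
}

$result = new SketchCollection([new SketchEntity('First'), new SketchEntity('Second')]);

foreach ($result as $entity) {
    echo $entity->author, PHP_EOL; // every entity is visited
}

echo $result->author, PHP_EOL; // only the first entity is consulted
```
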

### Just the URL, please

If you just want the final generated URL (for example, to paste into Postman Client or to test in the browser and get pure JSON), use `buildUrl`:

```php
$url = $articleApi->buildUrl();
```

You can continue regular API usage afterwards, which makes this very useful for logging, etc.
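
As a rough idea of what such a URL is made of, the sketch below assembles one with `http_build_query`. The endpoint layout (`https://api.diffbot.com/v3/<api>` with `token`, `url` and `fields` query parameters) is an assumption based on Diffbot's public v3 API, and `sketchBuildUrl` is a hypothetical helper, not the library's `buildUrl`. Because the function only reads its inputs, building the URL changes nothing, which is the property that makes this kind of call safe to use for logging before the real request.

```php
/**
 * Hypothetical helper: assembles a Diffbot-style request URL.
 * The endpoint layout is an assumption, not taken from this library.
 */
function sketchBuildUrl(string $api, string $token, string $pageUrl, array $fields = []): string
{
    $query = ['token' => $token, 'url' => $pageUrl];
    if ($fields !== []) {
        $query['fields'] = implode(',', $fields);
    }

    // http_build_query URL-encodes every value, including the target URL.
    return 'https://api.diffbot.com/v3/' . $api . '?' . http_build_query($query);
}

$url = sketchBuildUrl('article', 'my_token', 'http://someurl.com', ['meta']);
echo $url, PHP_EOL;
```
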

### Pure response

You can extract the pure, full Guzzle Response object from the returned data and then manipulate it as desired (maybe parsing it as JSON and processing it further on your own).

Individual entities do not have access to the response - to fetch it, always fetch from their parent collection (the object that the `call()` method returns).
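
As a sketch of the "process it further on your own" route: once you have the raw response body as a string, `json_decode` turns it into an array you can walk. The body below is a hand-written stand-in, and its `objects`, `title` and `author` keys are assumptions about the shape of the JSON rather than a captured Diffbot response.

```php
// Hand-written stand-in for a response body; not real Diffbot output.
$body = '{"objects":[{"title":"Some title","author":"Some Author"}]}';

$data = json_decode($body, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// Fall back to an empty list when the assumed "objects" key is missing.
$objects = isset($data['objects']) ? $data['objects'] : [];

foreach ($objects as $object) {
    echo $object['title'] . ' by ' . $object['author'] . PHP_EOL;
}
```
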

### Discussion and Post

The Discussion API returns some data about the discussion and contains another collection of Posts. A Post entity corresponds to a single review / comment / forum post, and is very similar in structure to the Article entity.

An Article or Product entity can contain a Discussion entity. Access it via `getDiscussion` on an Article or Product entity and use as usual (see above).

## Custom API

Used just like all others. There are only two differences:

1. When creating a Custom API call, you need to pass in the API name
2. It always returns Wildcard entities which are basically just value objects containing the returned data. They have `__call` and `__get` magic methods defined so their properties remain just as accessible as the other Entities', but without autocomplete.

The following is a usage example of my own custom API for author profiles at SitePoint:

Of course, you can easily extend the basic Custom API class and make your own, as well as add your own Entities that perfectly correspond to the returned data. This will all be covered in a tutorial in the near future.
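
To show how `__get` and `__call` can keep arbitrary returned data accessible, here is a minimal sketch of a Wildcard-style value object. `SketchWildcard` is a hypothetical stand-in written for this example, and the mapping from `getX()` to the `x` key is an assumption about how such magic methods might work, not the library's actual rule.

```php
// Stand-in class for illustration only; not the library's Wildcard entity.
final class SketchWildcard
{
    private $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    // $entity->author reads the "author" key, or null if it is absent.
    public function __get($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }

    // $entity->getAuthor() is mapped to the "author" key.
    public function __call($method, array $arguments)
    {
        if (strpos($method, 'get') === 0 && strlen($method) > 3) {
            return $this->__get(lcfirst(substr($method, 3)));
        }
        throw new BadMethodCallException("No such method: $method");
    }
}

$profile = new SketchWildcard(['author' => 'Bruno Skvorc']);

echo $profile->author, PHP_EOL;      // property-style access
echo $profile->getAuthor(), PHP_EOL; // method-style access, same value
```

Since the properties only exist at runtime, an IDE has nothing to index, which is the "without autocomplete" trade-off mentioned above.
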

## Testing

Just run PHPUnit in the root folder of the cloned project.
Some calls do require an internet connection (see `tests/Factory/EntityTest`).

```bash
phpunit
```

### Adding Entity tests

**I'll pay $10 for every new set of 5 Entity tests, submissions verified set per set - offer valid until I feel like there are enough use cases covered.** (a.k.a. don't submit 1500 of them at once, I can't pay that in one go).

If you would like to contribute by adding Entity tests, I suggest following this procedure:

1. Pick an API you would like to contribute a test for. E.g., Product API.
2. In a scratchpad like `index.php`, build the URL:

    ```php
    $diffbot = new Diffbot('my_token');
    $url = $diffbot
        ->createProductAPI('http://someurl.com')
        ->setMeta(true)
        // ...insert other config methods here as desired...
        ->buildUrl();
    echo $url;
    ```

3. Grab the URL and paste it into a REST client like Postman or into your browser. You'll get Diffbot's response back. Keep it open for reference.
4. Download this response, with headers, into a JSON file, preferably into `tests/Mocks/Products/[date]/somefilename.json`, like the other tests. This is easily accomplished by executing `curl -i "[url]" > somefilename.json` in the Terminal/Command Line.
5. Go into the appropriate tests folder, in this case `tests/Entity`, and open `ProductTest.php`. Notice how each mock file is added to the batch of files to be tested against: every provider references it, along with the value the method being tested should produce. Slowly go through every test method and add your file. Use the JSON response from step 3 to work out the expected values.
6. Run `phpunit tests/Entity/ProductTest.php` to test just this file (much faster than the entire suite). If OK, send PR :)
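
A file saved with `curl -i` holds the HTTP headers, a blank line, and then the JSON body. If you want to inspect such a file from a scratchpad, one way to separate the two is sketched below. The dump is a hand-written stand-in and `splitRawResponse` is a hypothetical helper; how the library's own tests load the mock files is not shown here.

```php
/**
 * Hypothetical helper: splits a raw HTTP response dump into
 * its header block and its body at the first blank line.
 */
function splitRawResponse(string $raw): array
{
    // Headers and body are separated by an empty line (CRLF CRLF, or LF LF).
    $parts = preg_split("/\r?\n\r?\n/", $raw, 2);

    return [
        'headers' => $parts[0],
        'body'    => isset($parts[1]) ? $parts[1] : '',
    ];
}

// Hand-written stand-in for the contents of a saved mock file.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{\"objects\":[]}";

$response = splitRawResponse($raw);
$json = json_decode($response['body'], true);

echo $response['headers'], PHP_EOL;
var_dump($json);
```
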

If you'd like to create your own Test classes, too, that's fine, no need to extend the ones that are included with the project. Apply the whole process, but rather than extending the existing `ProductTest` class, make a new one.

### Adding other tests

Other tests don't have specific instructions, contribute as you see fit. Just try to minimize actual remote calls - we're not testing the API itself (a.k.a. Diffbot), we're testing this library. If the library parses values accurately from an inaccurate API response because, for example, Diffbot is currently bugged, that's fine - the library works!

## Contributing

Please see [CONTRIBUTING](CONTRIBUTING.md) for details and [TODO](TODO.md) for ideas.

TODO.md (2 additions, 3 deletions):

    @@ -9,14 +9,13 @@ Active todos, ordered by priority

     ## Medium

    -- write usage example
     - add streaming to Crawlbot - make it stream the result (it constantly grows)
     - implement Video API (currently beta)
    -- improve Custom API
    -- improve Wildcard Entity - apply to Custom API
    +- add test case with mock for product that has discussion (Amazon?)

     ## Low

    +- add more usage examples
     - work on PhpDoc consistency ($param type vs type $param)
     - get more mock responses and test against them
     - write example with custom EntityIterator (different Entity set for different API) and custom Entity (i.e. authorProfile, which parses some of the data and prepares for further use)