| 1 | +# ScrapeGraphAI Java SDK Examples |
| 2 | + |
| 3 | +This module contains comprehensive examples demonstrating all features of the ScrapeGraphAI Java SDK. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +1. **API Key**: Set your ScrapeGraphAI API key as an environment variable: |
| 8 | + ```bash |
| 9 | + export SCRAPEGRAPHAI_API_KEY="your_api_key_here" |
| 10 | + ``` |
| 11 | + |
| 12 | + Or pass it as a JVM system property on the command line: |
| 13 | + ```bash |
| 14 | + -Dscrapegraphai.apiKey=your_api_key_here |
| 15 | + ``` |
| 16 | + |
| 17 | +2. **Java 8+**: This SDK requires Java 8 or later. |
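
To illustrate how the two configuration options above might be combined, here is a minimal sketch in plain Java 8 that looks up the key from the `SCRAPEGRAPHAI_API_KEY` environment variable and falls back to the `scrapegraphai.apiKey` system property. The class name `ApiKeyResolver` and the precedence order (environment variable first) are assumptions made for this illustration; the SDK's own lookup logic may differ.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, not part of the SDK: resolves the API key from the
// two sources named in the prerequisites.
final class ApiKeyResolver {
    static final String ENV_VAR = "SCRAPEGRAPHAI_API_KEY";
    static final String SYS_PROP = "scrapegraphai.apiKey";

    // Assumed precedence: environment variable first, then system property.
    static Optional<String> resolve(Map<String, String> env, Properties props) {
        return Optional.ofNullable(
                firstNonBlank(env.get(ENV_VAR), props.getProperty(SYS_PROP)));
    }

    // Returns the first candidate that is neither null nor whitespace-only, trimmed.
    private static String firstNonBlank(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Optional<String> key = resolve(System.getenv(), System.getProperties());
        if (key.isPresent()) {
            // Avoid printing the key itself; report only that it was found.
            System.out.println("API key found (" + key.get().length() + " characters)");
        } else {
            System.err.println("API key missing: set " + ENV_VAR + " or -D" + SYS_PROP);
            System.exit(1);
        }
    }
}
```

Passing the environment and properties in as parameters keeps the lookup easy to test without touching the real process environment.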
| 18 | + |
| 19 | +## Running Examples |
| 20 | + |
| 21 | +### Basic Example |
| 22 | +Run the main example (basic SmartScraper usage): |
| 23 | +```bash |
| 24 | +./gradlew :scrapegraphai-java-example:run |
| 25 | +``` |
| 26 | + |
| 27 | +### Specific Examples |
| 28 | +Run any specific example using the `-Pexample` parameter: |
| 29 | + |
| 30 | +```bash |
| 31 | +# Comprehensive SmartScraper examples |
| 32 | +./gradlew :scrapegraphai-java-example:run -Pexample=Smartscraper |
| 33 | + |
| 34 | +# Web search and scraping examples |
| 35 | +./gradlew :scrapegraphai-java-example:run -Pexample=Searchscraper |
| 36 | + |
| 37 | +# Website crawling examples |
| 38 | +./gradlew :scrapegraphai-java-example:run -Pexample=Crawl |
| 39 | + |
| 40 | +# HTML to Markdown conversion examples |
| 41 | +./gradlew :scrapegraphai-java-example:run -Pexample=Markdownify |
| 42 | + |
| 43 | +# Asynchronous operation examples |
| 44 | +./gradlew :scrapegraphai-java-example:run -Pexample=Async |
| 45 | + |
| 46 | +# Schema generation examples |
| 47 | +./gradlew :scrapegraphai-java-example:run -Pexample=SchemaGeneration |
| 48 | + |
| 49 | +# Utility service examples (validation, health, credits) |
| 50 | +./gradlew :scrapegraphai-java-example:run -Pexample=Utility |
| 51 | +``` |
| 52 | + |
| 53 | +## Example Overview |
| 54 | + |
| 55 | +| Example File | Description | Key Features | |
| 56 | +|-------------|-------------|--------------| |
| 57 | +| **Main.java** | Basic SmartScraper usage | Simple scraping, error handling, getting started | |
| 58 | +| **SmartscraperExample.java** | Comprehensive scraping scenarios | Custom schemas, pagination, JavaScript rendering, headers/cookies, result retrieval | |
| 59 | +| **SearchscraperExample.java** | Web search + data extraction | Multi-source aggregation, product comparison, news collection, academic research | |
| 60 | +| **CrawlExample.java** | Website crawling operations | Site exploration, path filtering, depth control, progress monitoring | |
| 61 | +| **MarkdownifyExample.java** | HTML to Markdown conversion | Content formatting, documentation generation, custom options | |
| 62 | +| **AsyncExample.java** | Asynchronous operations | Non-blocking requests, parallel processing, chaining operations, error handling | |
| 63 | +| **SchemaGenerationExample.java** | Automatic schema creation | Schema analysis, structured data extraction, schema-driven scraping | |
| 64 | +| **UtilityExample.java** | Service utilities | API validation, health checks, credit monitoring, feedback submission | |
| 65 | + |
| 66 | +## Example Details |
| 67 | + |
| 68 | +### 🕷️ SmartScraper Examples |
| 69 | +- Basic product information extraction |
| 70 | +- Structured data with custom JSON schemas |
| 71 | +- Dynamic content requiring JavaScript rendering |
| 72 | +- Pagination handling |
| 73 | +- Custom headers and cookies for authenticated scraping |
| 74 | +- Retrieving results by request ID |
| 75 | + |
| 76 | +### 🔍 SearchScraper Examples |
| 77 | +- General web search and data aggregation |
| 78 | +- Product comparison across multiple e-commerce sites |
| 79 | +- News aggregation from various sources |
| 80 | +- Academic research paper collection |
| 81 | +- Search progress monitoring |
| 82 | + |
| 83 | +### 🕸️ Crawl Examples |
| 84 | +- Basic website crawling |
| 85 | +- E-commerce catalog exploration |
| 86 | +- Blog and news site crawling |
| 87 | +- Documentation site mapping |
| 88 | +- Crawl progress monitoring and result retrieval |
| 89 | + |
| 90 | +### 📝 Markdownify Examples |
| 91 | +- Basic web page to Markdown conversion |
| 92 | +- Blog post formatting |
| 93 | +- Documentation page conversion |
| 94 | +- Direct HTML content processing |
| 95 | +- Conversion progress monitoring |
| 96 | + |
| 97 | +### ⚡ Async Examples |
| 98 | +- Basic asynchronous operations |
| 99 | +- Parallel processing of multiple URLs |
| 100 | +- Chaining async operations |
| 101 | +- Mixed service operations |
| 102 | +- Comprehensive error handling |
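
The parallel-processing, chaining, and error-handling items above can be sketched with the JDK's `CompletableFuture` alone. This is a rough illustration, not the code in `AsyncExample.java`: `ParallelFetchSketch`, `fetchAll`, and the `fakeScrape` function are hypothetical names, and `fakeScrape` merely stands in for whatever SDK call an example would make.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: fan out one task per URL, chain a post-processing
// step, and isolate failures so one bad URL does not abort the batch.
final class ParallelFetchSketch {

    // Runs `fetch` for every URL on the pool and returns results in input order.
    static List<String> fetchAll(List<String> urls,
                                 Function<String, String> fetch,
                                 ExecutorService pool) {
        // Start all futures first so the work runs concurrently.
        List<CompletableFuture<String>> futures = urls.stream()
                .map(url -> CompletableFuture
                        .supplyAsync(() -> fetch.apply(url), pool)
                        .thenApply(String::trim)               // chained step
                        .exceptionally(ex -> "FAILED " + url)) // per-URL fallback
                .collect(Collectors.toList());

        // join() waits for each future; exceptions were already mapped above.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Stand-in for a real scraping call (assumption for illustration).
            Function<String, String> fakeScrape = url -> {
                if (url.contains("bad")) {
                    throw new IllegalStateException("simulated failure");
                }
                return "  content of " + url + "  ";
            };
            List<String> results = fetchAll(
                    Arrays.asList("https://example.com/a", "https://example.com/bad"),
                    fakeScrape, pool);
            results.forEach(System.out::println);
        } finally {
            pool.shutdown(); // release worker threads
        }
    }
}
```

Collecting the futures into a list before joining is what makes the requests overlap; joining inside the first stream would run them one at a time.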
| 103 | + |
| 104 | +### 🏗️ Schema Generation Examples |
| 105 | +- E-commerce page schema generation |
| 106 | +- Blog content schema creation |
| 107 | +- Product listing schemas |
| 108 | +- Using generated schemas with SmartScraper |
| 109 | +- Schema analysis and monitoring |
| 110 | + |
| 111 | +### 🛠️ Utility Examples |
| 112 | +- API key validation and user information |
| 113 | +- Service health checks |
| 114 | +- Credit balance monitoring |
| 115 | +- Feedback submission |
| 116 | +- Comprehensive service status checks |
| 117 | + |
| 118 | +## Common Use Cases |
| 119 | + |
| 120 | +### E-commerce Data Extraction |
| 121 | +```java |
| 122 | +// Use SmartscraperExample for product details |
| 123 | +// Use SearchscraperExample for price comparison |
| 124 | +// Use CrawlExample for catalog exploration |
| 125 | +``` |
| 126 | + |
| 127 | +### Content Management |
| 128 | +```java |
| 129 | +// Use MarkdownifyExample for content conversion |
| 130 | +// Use CrawlExample for site mapping |
| 131 | +// Use SchemaGenerationExample for consistent data structure |
| 132 | +``` |
| 133 | + |
| 134 | +### Research and Analysis |
| 135 | +```java |
| 136 | +// Use SearchscraperExample for information gathering |
| 137 | +// Use SmartscraperExample for specific data extraction |
| 138 | +// Use AsyncExample for high-volume processing |
| 139 | +``` |
| 140 | + |
| 141 | +### Monitoring and Maintenance |
| 142 | +```java |
| 143 | +// Use UtilityExample for service health checks |
| 144 | +// Use AsyncExample for performance optimization |
| 145 | +// Use CrawlExample for site monitoring |
| 146 | +``` |
| 147 | + |
| 148 | +## Error Handling |
| 149 | + |
| 150 | +All examples include error handling that covers: |
| 151 | +- Network connectivity issues |
| 152 | +- API key validation problems |
| 153 | +- Rate limiting and quota management |
| 154 | +- Service availability checks |
| 155 | +- Data parsing and validation errors |
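
For transient failures such as network errors or rate limiting, one common pattern is retry with exponential backoff. The sketch below uses only the JDK and is an assumption about how such handling could look, not a copy of the examples' code: `RetrySketch` and `withRetry` are hypothetical names, and every `RuntimeException` is treated as retryable because the SDK's exception types are not described here.

```java
import java.util.function.Supplier;

// Hypothetical sketch: retry a call with exponentially growing delays.
final class RetrySketch {

    // Calls `call` up to maxAttempts times, doubling the delay after each failure.
    // Rethrows the last exception once the attempts are exhausted.
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the original error
                }
                sleep(delayMs);
                delayMs *= 2; // exponential backoff
            }
        }
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated flaky operation: fails twice, then succeeds.
        String result = withRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("simulated transient failure");
            }
            return "succeeded after " + calls[0] + " attempts";
        }, 5, 100L);
        System.out.println(result);
    }
}
```

In real use, the catch clause would likely be narrowed to the exceptions that actually indicate a retryable condition, so that errors like an invalid API key fail immediately.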
| 156 | + |
| 157 | +## Best Practices Demonstrated |
| 158 | + |
| 159 | +1. **Resource Management**: Proper client initialization and cleanup |
| 160 | +2. **Error Handling**: Graceful degradation and user-friendly error messages |
| 161 | +3. **Async Operations**: Efficient parallel processing and non-blocking operations |
| 162 | +4. **Configuration**: Environment-based configuration management |
| 163 | +5. **Monitoring**: Service health and usage monitoring |
| 164 | +6. **Documentation**: Clear code comments and usage examples |
| 165 | + |
| 166 | +## Getting Help |
| 167 | + |
| 168 | +- Check the main [README.md](../README.md) for general SDK documentation |
| 169 | +- Review the [API documentation](https://docs.scrapegraphai.com) for detailed endpoint information |
| 170 | +- Run the UtilityExample to verify your setup and check service status |
| 171 | +- Use the feedback functionality in UtilityExample to report issues or suggestions |
| 172 | + |
| 173 | +## Contributing |
| 174 | + |
| 175 | +If you have ideas for additional examples or improvements to existing ones, please: |
| 176 | +1. Fork the repository |
| 177 | +2. Create a new example following the existing patterns |
| 178 | +3. Add appropriate documentation |
| 179 | +4. Submit a pull request |
| 180 | + |
| 181 | +Each example should: |
| 182 | +- Be self-contained and runnable |
| 183 | +- Be well-documented with clear explanations |
| 184 | +- Include error handling |
| 185 | +- Demonstrate best practices |
| 186 | +- Cover real-world use cases |