Skip to content

Extractor Gem to analyse random strings for profile links of user. E.g. User uploads a PDF, scan it for all references to a social network profile.

License

Notifications You must be signed in to change notification settings

pludoni/network_profile

Folders and files

NameName
Last commit message
Last commit date

Latest commit

c33f063 · May 6, 2024

History

14 Commits
Sep 22, 2020
Sep 22, 2020
May 6, 2024
Nov 2, 2020
Oct 8, 2020
Sep 22, 2020
Sep 22, 2020
Nov 2, 2020
Jun 30, 2021
Sep 22, 2020
Nov 2, 2020
Sep 22, 2020
Nov 2, 2020

Repository files navigation

NetworkProfile

Gem Version

Extractor Gem to analyse random strings for profile links of user. E.g. User uploads a PDF, scan it for all references to a social network profile.

This work is extracted from the German Applicant Tracking System EBMS (https://bms.empfehlungsbund.de).

Installation

Add this line to your application's Gemfile:

gem 'network_profile'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install network_profile

Usage

Parse and extract one link

extraction = NetworkProfile.parse('https://github.com/zealot128', include_fallback_custom: true)
  • include_fallback_custom: true uses the default extractor (og/meta-tags) if no other more specific extractor is found
  • include_fallback_custom: false only use the specific website extractors and return nil if none matches the link

Scan a whole long string for links

links = NetworkProfile::Extractor.call("Very long String with even broken links in it www . github . com/zealot128")

Config

NetworkProfile.headers = {
  'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
  'Accept-Language' => 'de,en-US;q=0.7,en;q=0.3',
  'Referer' => 'https://www.google.com',
  'DNT' => '1',
  'User-Agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:73.0) Gecko/20100101 Firefox/73.0',
}
NetworkProfile.github_api_key = nil

Extractor

The following network profiles are supported:

GithubProfile/Company, GithubProject:

  • uses GH's GraphQL API (Thus a API KEY is required)

Instagram Facebook Linkedin

  • Because those websites are closed and defensive as hell, there is no extraction, just a simple matching (e.g. "Facebook profile")

Stackoverflow

  • Uses SO's API

Upwork XING ResearchGate

  • Custom Website Scraper/Extract JSON+LD

Default Fallback (Custom)

  • OG-Meta-Tags / HTML-Meta-Tags

License

The gem is available as open source under the terms of the MIT License.

About

Extractor Gem to analyse random strings for profile links of user. E.g. User uploads a PDF, scan it for all references to a social network profile.

Topics

Resources

License

Stars

Watchers

Forks