Popular repositories

  1. DataGenerator Public

    DataGenerator is a Java library for systematically producing large volumes of data. DataGenerator frames data production as a modeling problem, with a user providing a model of dependencies among v…

    Java 163 170

  2. herd Public

    Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and an…

    Java 135 41

  3. yum-nginx-api Public

    yum-nginx-api is a Go API for uploading RPMs to yum repositories, with configurations for running NGINX to serve them. It is a deployable solution with Docker or a single 8 MB statically linked Linux …

    Go 51 22

  4. MegaSparkDiff Public

    A Spark-based tool for comparing data at scale, which helps software development engineers compare many pair combinations of possible data sources. Multiple execution modes in multipl…

    Scala 51 27

  5. HiveQLUnit Public archive

    Test your Hive scripts inside your favorite IDE with HiveQLUnit! Increase your developers' productivity by testing on all operating systems, including Windows, Linux, and Mac OS X. Build continuous int…

    Java 39 13

  6. aphelion Public

    Aphelion is a web application that captures and visualizes your AWS service usage limits. It continuously collects data in the background, and you can visualize the data in easy-to-see graphs and c…

    Java 34 10

Repositories

Showing 10 of 23 repositories
