\renewcommand{\abstractname}{Preface}
\begin{abstract}
Nowadays one can speak of an ongoing revolution in web systems that turns
previous demands upside down. Internet companies have created a need for
activity data at a volume that has never existed before, and the trend
continues: not only companies whose core business is in the cloud, but also
more and more companies with traditionally fashioned core businesses are
generating huge amounts of data. Sooner or later this will surpass the
already overwhelming data volume of the internet big five (Apple, Microsoft,
Google, Amazon, Facebook). The data these companies generate captures user
and server activity. It is at the heart of many internet systems in the
domains of advertising, relevance, search, recommendation systems, and
security, while continuing to fulfil its traditional role in analytics and
reporting. The processing of this so-called activity data has changed too.
The days when nightly batch processing was the only way to produce valuable
information are gone; real-time processing, where information becomes
available as soon as data is generated, now dominates the field. Serving
data to both batch and real-time processing systems requires a central hub
that can handle the data load as well as the latency requirements. Many
products on the market provide the necessary capabilities up to a certain
point, but LinkedIn, among others (including the big five), found itself in
the unsatisfying position that no existing system could both process its
data load and scale at the same time. Thus, LinkedIn created Kafka, which
was soon open-sourced under the Apache license and is now known as Apache
Kafka.
\end{abstract}