[DNSOP] ENTRADA goes open source (.nl Hadoop platform)

"Giovane C. M. Moura" <giovane.moura@sidn.nl> Thu, 28 January 2016 11:29 UTC

To: DNSOP@ietf.org
From: "Giovane C. M. Moura" <giovane.moura@sidn.nl>
Message-ID: <56A9FB99.9080406@sidn.nl>
Date: Thu, 28 Jan 2016 12:29:29 +0100
Archived-At: <http://mailarchive.ietf.org/arch/msg/dnsop/ldFbn5VzaPypKAschsklWXNjmlY>

*************************************************************
ENTRADA GOES OPEN SOURCE
http://entrada.sidnlabs.nl/
*************************************************************

SIDN Labs [1] is happy to announce that we are releasing our ENTRADA
platform as an open source project [2]. ENTRADA is a Hadoop-based
platform designed to ingest and quickly analyze large amounts of network
data. More technically, it is a high-performance data streaming
warehouse (DSW).

Please refer to our research paper [3] for a performance evaluation and
more details.

WHO CAN USE IT:

* Internet measurement researchers who are in need of a
high-performance analytics platform
* Domain name registries and other network operators interested in
developing DNS big data applications (e.g., [4])

MAIN FEATURES:

* Performance: analyze the Parquet equivalent of 50 TB of pcap data
in under 3.5 minutes, with a small 6-node cluster (4 data processing
nodes).
* Interface: analyze your data with simple SQL statements
* Scalability: add more nodes for faster processing and more storage
* Protocols: built-in support for DNS (including the underlying
TCP/UDP/IP fields) and ICMP network data; other protocols can be added.
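
To give a feel for the SQL interface mentioned above, here is a minimal
sketch of the kind of aggregate query one might run over stored DNS
records. It uses Python's built-in sqlite3 module purely as a stand-in
engine, and the table name (dns_queries) and its columns (qname, qtype,
rcode) are hypothetical; ENTRADA's actual schema and query engine may
differ, so check the project documentation [2] before adapting it.

```python
import sqlite3

# Hypothetical schema for illustration only; ENTRADA's real tables and
# column names may differ, and SQLite is used here only as a stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dns_queries (qname TEXT, qtype TEXT, rcode INTEGER)")
rows = [
    ("example.nl", "A", 0),
    ("example.nl", "AAAA", 0),
    ("sidn.nl", "A", 0),
    ("doesnotexist.nl", "A", 3),  # rcode 3 is NXDOMAIN
    ("example.nl", "A", 0),
]
conn.executemany("INSERT INTO dns_queries VALUES (?, ?, ?)", rows)


def top_domains(connection, limit=10):
    """Return (qname, count) pairs for the most frequently queried names."""
    cursor = connection.execute(
        "SELECT qname, COUNT(*) AS n FROM dns_queries "
        "GROUP BY qname ORDER BY n DESC, qname ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


def nxdomain_share(connection):
    """Return the fraction of stored rows whose rcode is 3 (NXDOMAIN)."""
    (share,) = connection.execute(
        "SELECT AVG(CASE WHEN rcode = 3 THEN 1.0 ELSE 0.0 END) FROM dns_queries"
    ).fetchone()
    return share


print(top_domains(conn, 2))
print(nxdomain_share(conn))
```

The same GROUP BY / COUNT pattern should carry over to most SQL engines;
only the table and column names would need to match the real schema.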

DEPLOYMENT:

We have been using ENTRADA at SIDN Labs for the past 1.5 years. It
currently runs on a relatively small 6-node cluster (with 4 data nodes),
which stores DNS traffic from the .nl authoritative servers. In total,
ENTRADA has more than 100 billion queries and responses stored, and 400
million are added on a daily basis. It can also be deployed in a cloud
environment.
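
As a rough sanity check on the figures above, the following sketch
converts the daily volume into an average per-second rate and expresses
the stored total in days of traffic at the current rate. The two input
numbers come from this announcement; the variable names are ours, and
the stored total is only a lower bound ("more than 100 billion").

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_records = 400_000_000        # "400 million are added on a daily basis"
stored_records = 100_000_000_000   # "more than 100 billion" (lower bound)

# Average ingest rate, assuming traffic is spread evenly over the day.
avg_per_second = daily_records / SECONDS_PER_DAY

# How many days of traffic the stored total represents at today's rate.
days_at_current_rate = stored_records / daily_records

print(f"average ingest rate: {avg_per_second:,.0f} records/s")
print(f"days of traffic at the current rate: {days_at_current_rate:.0f}")
```

Real traffic is not evenly distributed over the day, so peak rates will
be higher than this average.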

We also make open aggregated datasets available at [5].


ABOUT SIDN AND SIDN LABS

SIDN [6] manages .nl, the country code TLD of the Netherlands.
SIDN Labs [1] is SIDN's R&D team, which develops and evaluates new
technologies and systems to improve both the stability and security of
the .nl zone.


References:

[1] https://www.sidnlabs.nl/
[2] http://entrada.sidnlabs.nl/
[3] https://www.sidnlabs.nl/sidn-noms2016.pdf
[4] http://iepg.org/2015-11-01-ietf94/iepg-moura.pdf
[5] http://stats.sidnlabs.nl
[6] https://sidn.nl