The Apache News Round-up: week ending 31 March 2017 (Apache Software Foundation Blogs)

We've had a great week! Here's what happened:

The Apache Software Foundation Announces 18 Years of Open Source Leadership https://s.apache.org/DHlr
 - We invite you to make a contribution and help ensure Apache projects continue to be freely available http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Welcome new ASF Board members Rich Bowen, Shane Curcuru, Bertrand Delacretaz, Ted Dunning, Jim Jagielski, Chris Mattmann, Brett Porter, Phil Steitz, and Mark Thomas.
 - Next Board Meeting: 19 April 2017. Board calendar and minutes available at http://apache.org/foundation/board/calendar.html
 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield reliable performance at 97.26% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today.
 - Learn the latest in Big Data, Cloud, Flex, IoT, Tomcat, and dozens of other leading Apache projects across 100+ sessions, 75+ speakers, 4 subconferences, BarCampApache and more. https://blogs.apache.org/conferences/entry/apachecon-tomorrow-s-software-today Sign up today and save $200!
 - Become an Apache Community Sponsor at ApacheCon http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache ActiveMQ™ – the most popular and powerful Open Source Message Broker.
 - Apache ActiveMQ Artemis 1.5.4 released http://activemq.apache.org/artemis/

Apache Atlas (incubating) –a scalable and extensible set of core foundational governance services that enables enterprises to effectively and efficiently meet their compliance requirements within Apache Hadoop and allows integration with the whole enterprise Big Data ecosystem.
 - Apache Atlas 0.8-incubating released http://www.apache.org/dyn/closer.cgi/incubator/atlas/0.8.0-incubating/

Apache Buildr™ –a build system for Java-based applications, including support for Scala, Groovy and a growing number of JVM languages and tools.
 - Apache Buildr 1.5.1 released http://buildr.apache.org/

Apache Calcite™ –a dynamic Big Data management framework.
 - Apache Calcite 1.12.0 released http://www.apache.org/dyn/closer.cgi/calcite/apache-calcite-1.12.0/

Apache Edgent (incubating) –a stream processing programming model and lightweight micro-kernel style runtime to execute analytics at devices on the edge or at the gateway.
 - Apache Edgent 1.1.0-incubating released https://edgent.apache.org/docs/downloads.html

Apache FreeMarker (incubating) –a template engine: a Java library to generate text output (HTML web pages, e-mails, configuration files, source code, etc.) based on templates and changing data.
 - Apache FreeMarker 2.3.26-incubating released http://freemarker.org/freemarkerdownload.html

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.2.5 released https://www.apache.org/dyn/closer.lua/hbase/1.2.5

Apache Lucene™ –a high-performance, full-featured text search engine library written entirely in Java.
 - Apache Lucene 6.5.0 released http://lucene.apache.org/core/mirrors-core-latest-redir.html
 - Apache Solr 6.5.0 released http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Apache Quickstep (incubating) –a high-performance SQL database that auto-manages configuration.
 - Apache Quickstep 0.1.0-incubating released https://quickstep.incubator.apache.org/release

Apache Parquet™ –a general-purpose columnar file format supporting nested data.
 - Apache Parquet C++ 1.0.0 released https://www.apache.org/dyn/closer.cgi/parquet/apache-parquet-cpp-1.0.0/

Apache Storm™ –a distributed, fault-tolerant, and high-performance realtime computation system that provides strong guarantees on the processing of data.
 - Apache Storm 1.1.0 released http://storm.apache.org/downloads.html

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 9.0.0.M19 released http://tomcat.apache.org/download-90.cgi

Apache UIMA™ –a component architecture and framework for the analysis of unstructured content like text, video and audio data.
 - Apache UIMA Java SDK 3.0.0-alpha02 released http://uima.apache.org/d/uimaj-3.0.0-alpha02/version_3_users_guide.html
 - Apache uimaFIT 2.3.0 released http://uima.apache.org/downloads.cgi#Latest Official Releases

Did You Know?

 - Did you know that we just held the annual ASF Members' Meeting? This is where new Members are elected; results will be announced at the end of next month. https://www.apache.org/foundation/governance/members.html

 - Did you know that Apache ActiveMQ, Ant, and Maven were named among the Top 50 DevOps Tools? https://stackify.com/top-devops-tools/

 - Did you know that Apache's 100M+ lines of code (LOCs) were developed over 65,000 person years and valued at US$7B? https://s.apache.org/DHlr


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ

 - We have been chosen as a Google Summer of Code (GSoC) Mentoring Organization for the 12th consecutive year --Apache Committer Mentors wanted! https://summerofcode.withgoogle.com/organizations/5416945173135360/

 - Feedback from The Apache Software Foundation on the Free and Open Source Security Audit (FOSSA) https://s.apache.org/romf

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The new Apache Community Facebook page is https://www.facebook.com/ApacheSoftwareFoundation/ Do friend and follow us. 

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

#DialUp the Web’s inventor for online security and rights (Defective by Design blogs)

An image of a telephone with overlaid text that reads '#DialUp to save the Web from DRM. +1 (617) 253-5702. Tell the Web's inventor: don't endanger our security and rights!'

Update: Since April 13th, Tim Berners-Lee has had the capability to ratify EME as an official W3C recommendation. He is taking longer than usual to make his decision, which the W3C attributes to the large amount of feedback from people like you, which he needs to consider. Keep the calls coming!

Since the beginning of the Web—the age of dial-up Internet connections—the W3C (World Wide Web Consortium) has kept the Web's technical standards tuned in a careful balance that enables innovation and respects users' rights.

On April 13th, that may change. Unless we can stop it, the W3C will welcome a new wave of user-hostile DRM (Digital Restrictions Management) onto the Web, making it harder than ever for us to be secure and free online.

The wave of DRM will come from the W3C's ratification of a proposed technical standard, EME (Encrypted Media Extensions), which will make it cheaper and easier for streaming video companies to build DRM into Web sites. That will invite more abuses of users like the Digital Editions DRM, which was found to be exposing user information to snoopers, and more digital locks preventing important, legal things that people do with media, like accessibility modifications, translation, commentary, and archiving.

Netflix, Apple, Google, and Microsoft are dead-set on EME. They are powerful—and their membership dues provide a lot of money to the W3C—but there is a weak link in their plan: Tim Berners-Lee, the Director of the W3C, can block EME when it comes to his desk on April 13th.

This is where you come in: #DialUp Tim Berners-Lee now and urge him not to endanger users by enshrining oppressive technology in the basic standards of the Web.

How to #DialUp for the free Web

1: Call Tim Berners-Lee's publicly listed W3C phone number: +1 (617) 253-5702. You will most likely reach his assistant or an answering machine.

2: Be polite and get straight to the point. We recommend you say:

Hi, my name is [NAME] and I am calling to urge Tim Berners-Lee to prevent Encrypted Media Extensions from becoming a W3C recommendation. I believe that the Web should promote user freedom and security, not undermine them. Please make sure that Mr. Berners-Lee receives this message. Thank you.

3: Click this link to email us, so we can announce how many people called Berners-Lee (this link does not email Tim Berners-Lee). You will not be automatically added to our email list, but you can join here.

Can't call the US? Call your closest W3C office (they are spread across the globe) and ask them to relay your message to Tim Berners-Lee.


Though EME is designed specifically for streaming video DRM, its ratification would catalyze long-simmering projects to add DRM to text and image standards on the Web. Perhaps worst of all, EME's ratification would lend political capital to the laws that make it a crime to circumvent DRM, even for security research, which is a crucial public accountability mechanism for software.

But we don't have to settle for this. There is deep opposition to EME within the W3C community; unsurprisingly, most of the people who dedicate their lives to improving the Web don't want DRM enshrined in it. Berners-Lee himself sees EME less as an improvement to the Web, and more as a necessary evil to placate streaming companies and Hollywood. If we can show him an unprecedented grassroots demand to reject DRM in Web standards, we have a chance to win on April 13th. Stand up for the free Web you love, and #DialUp Tim Berners-Lee now.


The Apache® Software Foundation Announces 18 Years of Open Source Leadership (Apache Software Foundation Blogs)

Billions of users depend on Apache's free, community-driven software; Foundation relies on charitable donations to advance the future of open development.

Forest Hill, MD —28 March 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today its 18th Anniversary and accomplishments, and rallied support to ensure future innovation.

Millions of people have depended on Apache software since the ASF's beginnings in 1999. Today, the ASF has grown to become a leading source for hundreds of Open Source software projects that meet the demand for interoperable, adaptable, and sustainable solutions. Apache's enterprise-grade projects power mission-critical applications in financial services, aerospace, publishing, government, healthcare, research, infrastructure, and more. 

"We are proud to celebrate our 18th Anniversary as one of the strongest and most emulated communities in Open Source," said ASF President Sam Ruby. "The ubiquity of Apache software across the globe attests to the trust in our projects and community. Our momentum demonstrates that the ASF has stayed true to our mission of producing software for the public good, and remains an influential force in accelerating the next generation of Open Source innovations."

The ASF serves approximately 9M source code downloads from Apache mirrors on a yearly basis (excluding convenience binaries). Worldwide dependency on Apache software continues to grow, with Web requests received from every Internet-connected country on the planet. https://projects.apache.org/statistics.html

Highlights include:

1) Membership --an increase of 2,852% from the inaugural 21 ASF Members to 620 Members today. ASF Membership comprises elected individuals who legally serve as the "shareholders" of the Foundation. This number is anticipated to increase following the next Apache Members’ Meeting (28-30 March 2017), when new Members are elected. https://www.apache.org/foundation/governance/members.html

2) Community --the ASF surpassed 6,000 project contributors (code and/or documentation) in February 2017. https://projects.apache.org/timelines.html . In addition to organic growth, the ASF participates in several community development activities, including Google Summer of Code (where the ASF has served as a mentoring organization since the program’s inception in 2005), as well as the ASF Travel Assistance program (that provides support to individuals attending ApacheCon, the ASF's official global conference series http://apachecon.com/ ). In addition, the Foundation maintains a Code of Conduct and initiated a new Diversity survey. http://community.apache.org/

3) Code and Contributions --the ASF is experiencing a greater influx of participants, with nearly 300 new code contributors and 300-400 new people filing issues each month. 31M (20%) of all Apache lines of code are comments --nearly three times as much as the entire Linux codebase-- underscoring the enormous scale of the ASF. The ASF's 100M functional lines of code (out of 150M+ total) have been developed over 65,000 person years, and are valued at US$7B. https://projects.apache.org/statistics.html

4) Projects --many of the ASF's 300+ projects today serve as the backbone for some of the world's most visible and widely used applications in Cloud (CouchDB, CloudStack, Mesos); Search and CMS (Derby, Jackrabbit, Lucene/Solr); DevOps and Build Management (Ant, Buildr, Maven); Servers (HTTP Web Server, Tomcat, Karaf, Traffic Server); and Web Frameworks (Flex, OFBiz, Struts), among other categories. From Abdera to ZooKeeper, Apache software continues to grow dramatically across many categories, including IoT and Edge Computing, Artificial Intelligence and Deep Learning, Mobile, and Big Data, where the Apache Hadoop ecosystem dominates the marketplace. The Apache HTTP Server, the ASF's original project, remains the most popular Web server on the planet, with 887,000 new active sites this month, totalling nearly 80M active sites. https://projects.apache.org/

5) Innovation --the Apache Incubator is home to a record 64 "podlings" undergoing development. Emerging projects span numerous categories, including Big Data, communication protocols, connected devices, cryptography, data science/machine learning/analytics, development frameworks, microfinances, remote desktop access, serverless computing, and more. A total of 280 projects have gone through the ASF's Incubation process. http://incubator.apache.org/

Each day, programmers, solutions architects, individual users, educators, researchers, corporations, governments, enthusiasts, and others select Apache software as their "go-to" choice for development tools, libraries, frameworks, visualizers, end-user productivity solutions, and more. To date, millions of software solutions have been distributed under the Apache License to allow for their free use, modification, and sharing. https://www.apache.org/licenses/

All Apache products are available to the public-at-large completely free of charge. https://www.apache.org/free/

SUPPORT APACHE


At the ASF, software development and project leadership is done entirely by volunteers. The ASF Board and officers are all volunteers. The collective Apache community comprises thousands of tireless individuals committed to ensuring that Open Source software remains free. Their dedication is bolstered by the billions of users who benefit from Apache software, and the many communities who adopt "The Apache Way" process as part of their own operations. https://www.apache.org/foundation/how-it-works.html

"We are often asked 'How can I help?'," said Hadrian Zbarcea, Vice President ASF Fundraising. "Our goal is to raise needed funds to help ensure that Apache software projects continue to be freely available to users around the world. We are grateful for the support from all our Sponsors and donors, and invite everyone to make a contribution, no matter the size. Every dollar counts."

As a United States private, 501(c)(3) not-for-profit charitable organization, the ASF is funded through tax-deductible contributions from corporations, foundations, and private individuals. Their collective contributions offset day-to-day operating expenses such as bandwidth and connectivity, servers and hardware, legal and accounting services, brand management and public relations, general office expenditures, and support staff. 

Approximately 75% of the ASF's US$1.2M annual budget is dedicated to running critical infrastructure support services. The ASF Infrastructure team of 10 rotating volunteers and 7 paid staff distributed on 4 continents keep Apache services running 24x7x365 at near 100% uptime on an annual budget of less than US$5,000 per project.

There are two ways to help the ASF reach its financial goals:
  • Individual Donations --for one-off or recurring gifts of any size, via credit card, PayPal, ACH, and more. 

  • ASF Sponsorship program --for organizations, individuals, foundations, and endowments contributing $5,000 or more.

Donations to the ASF help underwrite the world's largest Open Source foundation and enrich the lives of countless users and developers across the globe. Employers with matching gift programs are invited to include the ASF as part of their philanthropic programs to generously increase donations to the ASF to support its mission. http://apache.org/foundation/contributing.html

"These charitable donations are gifts that benefit the greater Apache Community," added Ruby. "We hope to be able to thank as many of our supporters as possible in person at ApacheCon in Miami this May."

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Abdera", "Apache Abdera", "Ant", "Apache Ant", "Buildr", "Apache Buildr", "CloudStack", "Apache CloudStack", "CouchDB", "Apache CouchDB", "Derby", "Apache Derby", "Flex", "Apache Flex", "Apache HTTP Web Server", "Jackrabbit", "Apache Jackrabbit", "Karaf", "Apache Karaf", "Lucene/Solr", "Apache Lucene/Solr", "Maven", "Apache Maven", "Mesos", "Apache Mesos", "OFBiz", "Apache OFBiz", "Struts", "Apache Struts", "Tomcat", "Apache Tomcat", "Traffic Server", "Apache Traffic Server", "Zookeeper", "Apache Zookeeper", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

HBase on Azure: Import/Export snapshots from/to ADLS (Apache Software Foundation Blogs)

by Apekshit Sharma, HBase Committer.

Overview

Azure Data Lake Store (ADLS) is Microsoft’s cloud alternative for Apache HDFS. In this blog, we’ll see how to use it as backup for storing snapshots of Apache HBase tables. You can export snapshots to ADLS for backup; and for recovery, import the snapshot back to HDFS and use it to clone/restore the table. In this post, we’ll go over the configuration changes needed to make HDFS client talk to ADLS, and commands to copy HBase table snapshots from HDFS to ADLS and vice-versa.

Introduction

“The Azure Data Lake store is an Apache Hadoop file system compatible with Hadoop Distributed File System (HDFS) and works with the Hadoop ecosystem.”

ADLS can be treated as any HDFS service, except that it’s in the cloud. But then how do applications talk to it? That’s where the hadoop-azure-datalake module comes into the picture. It enables an HDFS client to talk to ADLS whenever the following access path syntax is used:

adl://<Account Name>.azuredatalakestore.net/


For example:
hdfs dfs -mkdir adl://<Account Name>.azuredatalakestore.net/test_dir
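If you script against several ADLS accounts, it can help to build the path in one place. The snippet below is a minimal sketch of that idea; the helper name adls_uri is hypothetical and not part of Hadoop, and it only formats the access path syntax shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): build an ADLS URI from an
# account name and a path inside that account.
adls_uri() {
  local account="$1"
  local path="${2#/}"   # drop one leading slash so we never emit "//"
  printf 'adl://%s.azuredatalakestore.net/%s\n' "$account" "$path"
}

# Print the URI for a directory named test_dir in the "appy" account.
adls_uri appy test_dir

# The result can be passed to any hdfs dfs command, for instance:
#   hdfs dfs -ls "$(adls_uri appy test_dir)"
```

The function does no validation of the account name; it simply keeps the host suffix and scheme consistent across commands.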

However, before it can access any data in ADLS, the module needs to be able to authenticate to Azure. That requires a few configuration changes. These we describe in the next section.

Configuration changes

ADLS requires an OAuth2 bearer token to be present in the HTTPS header of each request. Users who have access to an ADLS account can obtain this token from the Azure Active Directory (Azure AD) service. To allow an HDFS client to authenticate to ADLS and access data, you’ll need to specify the credentials used to obtain this token in core-site.xml using the following four configurations:


<property><name>dfs.adls.oauth2.access.token.provider.type</name><value>ClientCredential</value></property>

<property><name>dfs.adls.oauth2.refresh.url</name><value>xxx</value></property>
<property><name>dfs.adls.oauth2.client.id</name><value>xxx</value></property>
<property><name>dfs.adls.oauth2.credential</name><value>xxx</value></property>


To find the values for dfs.adls.oauth2.* configurations, refer to this document.


Since all files/folders in ADLS are owned by the account owner, its ACL environment doesn’t work well with that of HDFS, which can have multiple users. Because the user issuing commands through the HDFS client will differ from the one in Azure AD, any operation that checks ACLs will fail. To work around this issue, use the following configuration, which tells the HDFS client to assume, for ADLS requests, that the current user owns all files.


<property><name>adl.debug.override.localuserasfileowner</name><value>true</value></property>


Make sure to deploy the above configuration changes to the cluster.
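One way to confirm the deployment on a node is to look for the property names in the deployed file. The following is a rough sketch, not an official tool: the function name missing_adls_props and the default path /etc/hadoop/conf/core-site.xml are assumptions, and the check only matches names written as <name>...</name> with no extra whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical check: list the ADLS properties from this post that are
# missing from a core-site.xml file. The default path is an assumption;
# pass the real location of your file as the first argument.
missing_adls_props() {
  local conf="${1:-/etc/hadoop/conf/core-site.xml}"
  local prop
  for prop in \
      dfs.adls.oauth2.access.token.provider.type \
      dfs.adls.oauth2.refresh.url \
      dfs.adls.oauth2.client.id \
      dfs.adls.oauth2.credential \
      adl.debug.override.localuserasfileowner; do
    # -F matches the name literally; -q suppresses normal output.
    # An unreadable or absent file counts as "everything missing".
    grep -qF "<name>${prop}</name>" "$conf" 2>/dev/null || echo "$prop"
  done
}
```

Calling missing_adls_props with the path to the deployed core-site.xml prints one line per missing property and nothing at all when all five are present.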

Export snapshot to ADLS

Here are the steps to export a snapshot from HDFS to ADLS.

  1. Create a new directory in ADLS to store snapshots.

$ hdfs dfs -mkdir adl://appy.azuredatalakestore.net/hbase


$ hdfs dfs -ls adl://appy.azuredatalakestore.net/

Found 1 items

drwxr-xr-x   - systest hdfs          0 2017-03-21 23:43 adl://appy.azuredatalakestore.net/hbase


  2. Create the snapshot. To know more about this feature and how to create/list/restore snapshots, refer to the HBase Snapshots section in the HBase reference guide.

  3. Export the snapshot to ADLS.

$ sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot <snapshot_name> -copy-to adl://appy.azuredatalakestore.net/hbase


[Output]

17/03/21 23:50:24 INFO snapshot.ExportSnapshot: Copy Snapshot Manifest

17/03/21 23:50:48 INFO snapshot.ExportSnapshot: Export Completed: snapshot_1


  4. Verify that the snapshot was copied to ADLS.


$ hbase snapshotinfo -snapshot <snapshot_name> -remote-dir adl://appy.azuredatalakestore.net/hbase

Snapshot Info

----------------------------------------

  Name: snapshot_1

  Type: FLUSH

 Table: t

Format: 2

Created: 2017-03-21T23:42:56


  5. It’s now safe to delete the local snapshot (the one in HDFS).
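The export and verification steps can be chained so that verification only runs after a successful export. Here is a minimal sketch under stated assumptions: the function name export_snapshot_to_adls and the RUN switch are illustrative, while the two commands themselves are the ones used in the steps above. By default the function only prints what it would run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print (or, with RUN=1, execute) the export and
# verification commands for one snapshot and one ADLS destination.
export_snapshot_to_adls() {
  local snapshot="$1" dest="$2"
  local -a export_cmd=(sudo -u hbase hbase
    org.apache.hadoop.hbase.snapshot.ExportSnapshot
    -snapshot "$snapshot" -copy-to "$dest")
  local -a verify_cmd=(hbase snapshotinfo
    -snapshot "$snapshot" -remote-dir "$dest")

  if [ "${RUN:-0}" = "1" ]; then
    # Verify only if the export succeeded.
    "${export_cmd[@]}" && "${verify_cmd[@]}"
  else
    # Dry run: show the commands, one per line.
    echo "${export_cmd[*]}"
    echo "${verify_cmd[*]}"
  fi
}

# Dry run for the snapshot and directory used in this post.
export_snapshot_to_adls snapshot_1 adl://appy.azuredatalakestore.net/hbase
```

Deleting the local snapshot is deliberately left out of the wrapper, so that step stays a manual decision after you have read the verification output.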

Restore/Clone table from a snapshot in ADLS

If you have a snapshot in ADLS which you want to use either to restore an original table to a previous state, or create a new table by cloning, follow the steps below.

  1. Copy the snapshot back from ADLS to HDFS. Make sure to copy to the ‘hbase’ directory on HDFS, because that’s where the HBase service looks for snapshots.

$ sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot <snapshot_name> -copy-from adl://appy.azuredatalakestore.net/hbase -copy-to hdfs:///hbase


  2. Verify that the snapshot exists in HDFS. (Note that there is no -remote-dir parameter.)

$ hbase snapshotinfo -snapshot snapshot_1


Snapshot Info

----------------------------------------

  Name: snapshot_1

  Type: FLUSH

 Table: t

Format: 2

Created: 2017-03-21T23:42:56


  3. Follow the instructions in the HBase Snapshots section of the HBase reference guide to restore/clone from the snapshot.
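For orientation, the last step usually comes down to a handful of HBase shell commands. The sketch below only prints them; the function names and the table name t_clone are illustrative assumptions, and the reference guide remains the authority on the exact procedure for your HBase version.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that emit HBase shell commands as text.

# Create a new table from a snapshot.
clone_commands() {
  local snapshot="$1" new_table="$2"
  printf "clone_snapshot '%s', '%s'\n" "$snapshot" "$new_table"
}

# Roll an existing table back to a snapshot.
restore_commands() {
  local snapshot="$1" table="$2"
  # The table is disabled before the restore and enabled again afterwards.
  printf "disable '%s'\n" "$table"
  printf "restore_snapshot '%s'\n" "$snapshot"
  printf "enable '%s'\n" "$table"
}

# Print the commands for the snapshot and table used in this post.
clone_commands snapshot_1 t_clone
restore_commands snapshot_1 t
```

The printed lines can be pasted into an interactive hbase shell session. Keeping them as text first makes it easy to review a restore, which overwrites the current state of the table, before running it.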

Summary

The Azure module in HDFS makes it easy to interact with ADLS. We can keep using the commands we already know, and our applications that use the HDFS client just need a few configuration changes. What a seamless integration! In this blog, we got a glimpse of the HBase integration with Azure: using ADLS as a backup for storing snapshots. Let’s see what the future has in store for us. Maybe an HBase cluster fully backed by ADLS!