We believe that communities of knowledge, culture and innovation are more vital and energetic when outcomes are shared, enabling others to more effectively build on past work. As active members of a global software community, we think it’s useful and important to share some of our knowledge and expertise by making contributions to open source and other community-based software efforts.
We also participate in FOSS conferences – it’s great to meet and collaborate with peers and share our experience through talks and workshops.
FOSS4G Boston 2017
We’re proud to sponsor FOSS4G Boston 2017 and we’re excited to participate throughout the week. We’re running a workshop, presenting talks, and attending training sessions.
Reach out via the FOSS4G app or through email if you’ll be in Boston next week and want to meet up!
Workshop: Analyzing large raster data in a Jupyter notebook with GeoPySpark on AWS (SOLD OUT)
Rob Emanuele, Vice President of Research
Tuesday, Aug. 15, 2017
8:00 AM – 12:00 PM
This workshop will focus on doing analytics on large-scale raster data in Python through a Jupyter notebook.
If you work with raster data like satellite imagery or elevation data, you might have run into the “Big Data” problem: analyzing and transforming very large sets of rasters is difficult. GeoTrellis is a library for processing and analyzing large raster data with Apache Spark. It is written in Scala, which does not have as large a following as Python, particularly in the geospatial developer community. To account for this, Spark, which is also written in Scala, has Python bindings (PySpark) that allow jobs to be written in Python and executed against a Spark cluster. The GeoTrellis developers have followed PySpark’s example and created the GeoPySpark project. This project gives Python developers the ability to process large raster data with the power of GeoTrellis and Spark, and establishes a framework for incorporating Python usage with other Spark-based geospatial projects such as GeoMesa. With GeoPySpark, Pythonistas have access to the power of GeoTrellis while still being able to use the tools they are familiar with (e.g. NumPy). All this functionality is accessible through Jupyter notebooks, which offer a simple and effective way to interact with Spark clusters.
Attendees should leave this workshop having learned about Spark, GeoTrellis, Jupyter and Amazon Web Services, and be armed with the knowledge necessary to take on Big Raster Data problems using Python.
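To give a feel for the approach before the workshop, here is a minimal sketch of the tile-and-combine pattern that makes large rasters tractable. It is not the GeoPySpark API: it uses only NumPy and the standard library, and the function names (`split_into_tiles`, `tile_summary`, `merge`, `raster_mean_and_max`) are hypothetical names chosen for illustration. On a Spark cluster the same map and reduce steps would presumably run across many machines rather than in one process.

```python
from functools import reduce

import numpy as np


def split_into_tiles(raster, tile_size):
    """Yield ((tile_row, tile_col), tile) pairs that together cover the raster."""
    rows, cols = raster.shape
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            key = (r // tile_size, c // tile_size)
            yield key, raster[r:r + tile_size, c:c + tile_size]


def tile_summary(tile):
    """Map step: reduce one tile to (sum, cell count, max)."""
    return float(tile.sum()), int(tile.size), float(tile.max())


def merge(a, b):
    """Reduce step: combine two partial summaries into one."""
    return a[0] + b[0], a[1] + b[1], max(a[2], b[2])


def raster_mean_and_max(raster, tile_size=256):
    """Compute the mean and max of a raster one tile at a time."""
    summaries = (tile_summary(tile) for _, tile in split_into_tiles(raster, tile_size))
    total, count, peak = reduce(merge, summaries)
    return total / count, peak


if __name__ == "__main__":
    # A small synthetic array stands in for real imagery or elevation data.
    demo = np.arange(12, dtype=np.float64).reshape(3, 4)
    print(raster_mean_and_max(demo, tile_size=2))
```

Because each tile is summarized independently and the partial results are merged with an associative function, no single step needs the whole raster in memory, which is the property a distributed engine relies on.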
The Story of Open Source Business Models at Azavea
Robert Cheetham, Founder & CEO
Wednesday, Aug. 16, 2017
11:00 AM – 11:30 AM
Open source software is often framed as an alternative to commercial software by both its advocates and those who see it as competition. We think that framing is a mistake. Few open source projects have become successful without support from commercial firms, and there are now few commercial software products that don’t rely upon open source components. Azavea, a B Corporation, was founded in 2000. Over the past several years, Azavea has become a leader in using and developing open source geospatial technology, but this has not always been the case. In fact, Azavea didn’t use any open source technology for the first several years of its life. In this talk, the founder and CEO, Robert Cheetham, will describe the evolution of the firm’s use of and contributions to open source geospatial technology, including how its 10% R&D program led to its first open source work; the evolution of the GeoTrellis project; how it strikes a balance between open source code and commercial concerns for each of its three SaaS products; current priorities for open source projects; and how Azavea navigates its partner relationships with commercial firms.
Python Raster Processing on Serverless Architecture
Matt McFarland, Vice President of Engineering, Professional Services
Wednesday, Aug. 16, 2017
4:30 PM – 5:00 PM
Using Python and some popular geospatial libraries, it’s easy to write code to do raster analysis against imagery and thematic data. But what if you want to expose that code as a web service? The first step is to start provisioning a server and the second is to worry about how you’re going to scale and manage traffic. What if you could just focus on your code?
Within Amazon Web Services, there are a number of ways to launch a geospatial web service. You can host the service on a traditional EC2 instance, containerize it and run it from ECS, or use the “serverless” compute environment provided by AWS Lambda. Because raster data processing is typically seen as a heavyweight operation, Lambda may not seem like a smart choice. In this talk, I’d like to convince you otherwise.
Lambda is a service that runs functions in response to events, and automatically scales the resources required to respond to them. You are only billed for the time your function is actually executing. With such a compelling offering comes a host of obstacles, including significant resource limitations. Using the Python runtime, I’ll show how to write and deploy Python code with libraries such as Rasterio, GDAL, and NumPy within the constraints of Lambda against national-scale rasters. In addition, I’ll show how you can wire up those Lambda functions to the Amazon API Gateway and start serving your results over a REST endpoint, all at scale and without managing a single server.
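As a rough illustration of the pattern this talk describes, the sketch below shows a Python function with the `handler(event, context)` shape that Lambda invokes, returning a response in the format API Gateway's proxy integration expects (`statusCode`, `headers`, `body`). It is not the code from the talk. The `load_window` helper and the `width` and `height` query parameters are hypothetical, and the helper returns a synthetic NumPy array where a real service would presumably do a windowed read with a library such as Rasterio.

```python
import json

import numpy as np


def load_window(width, height):
    """Hypothetical stand-in for a windowed raster read.

    Returns a synthetic array so the sketch stays self-contained.
    """
    return np.arange(width * height, dtype=np.float64).reshape(height, width)


def respond(status, payload):
    """Build a response in the API Gateway proxy-integration shape."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Lambda entry point: summarize a raster window described by the request."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    try:
        width = int(params.get("width", 256))
        height = int(params.get("height", 256))
    except ValueError:
        return respond(400, {"error": "width and height must be integers"})
    if width <= 0 or height <= 0:
        return respond(400, {"error": "width and height must be positive"})

    window = load_window(width, height)
    stats = {
        "min": float(window.min()),
        "max": float(window.max()),
        "mean": float(window.mean()),
    }
    return respond(200, stats)


if __name__ == "__main__":
    # Local invocation with a hand-built event; no AWS account is needed.
    print(handler({"queryStringParameters": {"width": "4", "height": "3"}}, None))
```

Keeping the handler a plain function of `event` and `context` means it can be exercised locally with a hand-built dictionary before it is deployed.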
Other staff you’ll see at FOSS4G
- Ross Bernet, GeoTrellis Project Manager
- Chris Brown, Senior Software Engineer
- Dan Ford, Community Ambassador
- Jenny Fung, Software Engineer
- Esther Needham, Data Analytics Project Manager
Check our GitHub for a list of tools and libraries, as well as our work on the GeoTrellis, Raster Foundry, OpenDataPhilly, and DistrictBuilder projects.