Blog

Solving Memory Issues with Custom Crawlers

by Nick Drane

Spantree recently worked with a company whose goal is to simplify regulatory compliance. One of their core value propositions is to identify, retrieve, and notify their clients of new regulatory documents published on the websites of organizations like the FTC and the SEC. This company needs to notify customers of new regulatory documents the day they are published. They called Spantree because the critical part of their infrastructure that coordinated the identification and retrieval of these documents was failing. This is the story of how we made that system an order of magnitude more scalable and simultaneously improved its reliability.

  • crawler

GraphQL Schema Stitching: Modularizing GraphQL Schemas

by Nick Drane

Many introductory GraphQL tutorials put all the GraphQL type definitions in a single GraphQL file. This works fine for small applications but quickly imposes maintenance costs as the size of that file balloons to hundreds or thousands of lines. Most projects will want to split their GraphQL type definitions across multiple files. This process is the topic of this blog post. The complete code can be found on GitHub.

  • graphql
  • schema stitching
  • modular code

Field Notes From Strange Loop 2018

by Mari Gallegos

It is no secret that Spantree has long been a fan of Strange Loop. For the past seven years, we’ve paused the company and descended on St. Louis with a full force of Spantreers to take part in this progressive and compelling conference.

  • conference
  • strange loop

Celebrate Halloween with React Final Form

by eileen

As a dedicated fan of costume parties, I started the countdown to Halloween a long time ago. This year a group of friends and I will be dressing up as different characters Meryl Streep has played in her career. The winner of the costume party gets a (fake) Oscar!

  • react
  • react final form
  • front-end
  • meryl streep

From Flask to Serverless

by dan

At Spantree we work with a variety of clients, from startups to enterprises. Regardless of their company size, all of our clients want to save as much as possible on IT infrastructure costs. Follow along as we take a traditional backend Flask app, and use it to explore serverless architectures and the Serverless framework.

  • flask
  • python
  • serverless
  • event driven
  • architecture
  • aws
  • cloud

Spantree Selects: Strange Loop 2017

by kevin

Every year since I joined, Spantree has gone to Strange Loop as a company. It is a wonderful experience that I would highly recommend, but with all the amazing talks, I miss several that interest me. Deciding what to watch the week after is a fun tradition that I outsource to all my coworkers, and here are their thoughts!

  • strange loop
  • talks
  • spantree selects

Three Tips for Modern SQL

by Roberto Guerra

Relational Database Management Systems and SQL have gotten a bad rap over the past 10 years or so, and some of it deservedly so. But we tend to overlook that the SQL standard is being continuously improved and that it has become even more powerful than it was 20 years ago. In fact, SQL is now considered to be Turing complete. In this blog post I'll try to shed some light on three of these features that were not available when I first started working with SQL in the late 90s and that are overlooked in lots of tutorials for beginners. All the example queries written here were tested against PostgreSQL 9.4.

  • sql
  • database

11 Tips for Strange Loop First-Timers

by dan

First time going to Strange Loop? Me too! In preparation for Strange Loop next week I gathered advice from around the Spantree office and put it into one handy list. Whether this is your first time going to the conference or your ninth, I hope you find some great ideas for how to maximize your conference this year.

  • conferences
  • strange-loop

What is JSON? Assumptions and "basic knowledge"

by jonathan

In consulting, I'm constantly interacting with people from very different technical backgrounds. Sometimes it's a PhD with more understanding of distributed systems theory than I could hope to have, sometimes it's a non-technical C-suite executive, sometimes it's a 30-year veteran who at some point found themselves stuck in a certain era of technology.

  • json
  • people
  • consulting

Handling Kotlin Exceptions with Kategory – A Functional Approach

by Roberto Guerra

Kotlin has excited us at Spantree since we first heard about it. As advocates for alternative languages on the JVM, we love the benefits it's brought to client and internal projects. The biggest benefit Kotlin has provided in the code I've written with it is the way it can handle exceptions and errors.

  • kotlin
  • kategory
  • functional programming

Bootstrapping the Web with Scala Native (Part 1)

by richard

Scala Native has advanced significantly since I last wrote about it, including a new garbage collector and substantially broader coverage on POSIX and JDK bindings. In terms of use cases, it's making rapid progress on command-line tools like scalafmt, and even scalac itself.

  • scala
  • scala native
  • web

IntelliJ Fonts for Pairing and Presentations

by kevin

I've always been a big fan of IntelliJ for editing most anything on the JVM. But I've sat through and given plenty of presentations where the font was wrongly sized or code was hard to read, and the presentation suffered because of it. Throughout this post, we'll get our editor set up, with shortcuts, to effectively switch between pairing, presenting, and personal development.

  • intellij
  • ide
  • jetbrains

Simple Command Line Tools with Scala Native

by kevin

I've been enjoying writing Scala lately, but there are some legitimate cases where being on the JVM just isn't the right fit. Despite liking Scala more than the alternatives, I've had to turn to languages like Go and Node.js to write simple command line tools. With the introduction of Scala Native, that all changes.

  • scala
  • scala native
  • cli

Tiny Docker Images for Scala Native with Multi-Stage Builds

by richard

I've been really excited by the rapid progress of Scala Native since its initial release just a few months ago. As a systems-oriented Scala hacker, I'm eager to use my favorite language for small, standalone tools, without some of the downsides of the JVM.

  • scala
  • scala-native
  • docker
  • linux
  • compilers

Easy Score Testing with OptaPlanner 7

by kevin

OptaPlanner is a great tool to solve planning problems, such as meeting scheduling, shift assignments, and more. However, these are often domains where people have grown to trust humans over computers. I've personally encountered a good bit of skepticism when I claim that a computer can solve scheduling problems better than a trained human. Whenever there may be doubt about the validity of a computer's solution, it's important to be able to prove that things are working properly.

  • testing
  • optaplanner
  • spock

Git Tip: Cherrypicking out of a bind!

by Roberto Guerra

There are times I forget to create a new branch when I'm working on a new feature and only notice the mistake when I push my first commit. Just today, one of my co-workers (Jonathan) showed me some neat git tricks that helped me correct my mistake. In this article I'll show how to use cherrypicking to get out of a bind.

  • git

Deep Learning On Demand with Spark, Akka, and GraphQL

by richard

Machine learning can be an opaque undertaking. As algorithms grow more and more complex, we need specialized tools to answer questions like, "Why did the computer think this was spam?" or "Why did your service recommend this movie to me?" In my last post, I wrote about a model that was elegantly straightforward: the information content of an item is the negative logarithm of its relative frequency. Newer models, however, are far less transparent: most famously, Google's DeepDream pattern-recognition software produces phantasmagoric images that are at once completely fascinating and entirely unfathomable.

  • scala
  • spark
  • akka
  • graphql
  • sangria
  • machine learning

Grooving with Toggl and Slack

by

  • groovy
  • toggl
  • slack
  • intern

Customizing Dashing Dashboard 101

by Amanda

As an intern at Spantree, I, Amanda Wang, decided to work with Dashing and create a dashboard for Spantree. I wasn't really sure how we wanted the dashboard to turn out, but I'm glad that all these widgets work. At first, Dashing was very confusing for me, mostly because it didn't have enough documentation for me to truly understand how it works. Looking up solutions and explanations for Dashing became a hassle. Some people use Dashing, but few have created clear documentation for their code.

  • ruby
  • dashing
  • intern
  • dashboard

Clojure project with a corporate maven repository

by Sebastian Otaegui

Clojure is great, and it has a ton of features that allow developers to increase productivity. With this post we intend to show how to configure Clojure to use a repository manager as a proxy and how to set up a Clojure project to deploy JAR files for shared usage.

  • leiningen
  • clojure
  • nexus
  • sonatype

Useful Leiningen Plugins

by kevin

Clojure has a great development flow, but a few tools aren't included by default. By adding a few dependencies and plugins to ~/.lein/profiles.clj, you can make your development workflow smoother, quicker, and more effective.

  • clojure
  • leiningen

Readable Clojure Through Threading

by kevin

Clojure, like many Lisps, sometimes struggles to attract newcomers who claim it's "hard to read". Any paradigm shift requires time, but I myself struggled to read Clojure I had written early on. Nested parentheses and REPL-driven development made the result come quickly, but it often looked ugly. However, the thread operator -> and all of its cousins fix that.

  • clojure

Finding Surprises in your Data with Spark

by richard

In this post, I'll demonstrate my all-time favorite natural language processing (NLP) trick: "surprisal", a statistical measure of the unlikeliness of any event, which can be applied to just about anything that you can count. Scala is a wonderful language for this sort of data crunching, largely because of Apache Spark, a powerful distributed computing framework. For this post, I'll be using Apache Zeppelin as an interactive, web-based shell around Spark. If anyone's interested in following along, I encourage you to download a Zeppelin binary distribution and have fun!
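As a rough illustration of the idea (not the post's own Spark and Zeppelin code), surprisal can be sketched as the negative base-2 logarithm of an item's relative frequency. The `surprisal` helper and the toy sentence below are assumptions made up for this example:

```python
import math
from collections import Counter


def surprisal(counts, item):
    """Surprisal in bits: -log2 of the item's relative frequency.

    `counts` maps each observed item to how many times it was seen.
    This helper is hypothetical and exists only for illustration.
    """
    total = sum(counts.values())
    return -math.log2(counts[item] / total)


# Toy corpus (an assumption): 8 words, "the" appears 3 times.
words = "the cat sat on the mat the end".split()
counts = Counter(words)

# Rare items carry more surprisal than common ones.
print(surprisal(counts, "cat"))  # seen 1 time out of 8
print(surprisal(counts, "the"))  # seen 3 times out of 8
```

Because the measure only needs counts, the same function works for words, log lines, or any other countable event.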

  • scala
  • spark
  • data
  • text

Quick and dirty docker deployment in AWS

by Sebastian Otaegui

At Spantree we periodically do hackathons, sometimes for internal projects, at other times for non-profit organizations. Every year on Martin Luther King, Jr. Day we hold our annual hackathon for social good, which is always exciting.

  • docker
  • aws
  • 5-minute-read

Using Google Container Registry with Marathon

by jonathan

We were recently tasked with setting up a container management solution on Google Compute Engine (GCE). After standing up Mesos and Marathon on CoreOS, our initial tests worked fine and we deployed several Docker apps to Marathon without trouble. We ran into a snag, however, when switching from a public registry to a private Google Container Registry.

  • mesos
  • marathon
  • GCR
  • docker

5 Things About Scala (that I wish I knew 6 years ago)

by richard

Like a lot of developers, I first heard about Scala 6 years ago, when Twitter ported their failure-prone Ruby backend to a language I'd never heard of, one that looked kind of like Python but performed like Java or C++. If I had to give a one-sentence elevator pitch for Scala, I'd still say what I read back then: Scala is a statically compiled language with all of the performance, safety, and tooling of Java, without the verbosity, so that a developer can write Scala code as fast as Ruby, Python, or JavaScript.

  • scala
  • java
  • functional
  • safe
  • lazy

Using Jekyll as a Content Management System

by maria

Starting out with Jekyll seemed intimidating - I’d be working in a text editor and running code from the command line - but it doesn’t get much more difficult than that.

  • jekyll
  • CMS
  • Liquid
  • markdown

Websockets with Clojure

by kevin

As we continue to evaluate Clojure, new avenues present themselves. In the past, highly concurrent processing at Spantree has been done with node.js or a JVM solution on top of Jetty. While these work, I never found them particularly easy or enjoyable to use.

  • clojure
  • websockets
  • http-kit

10 Things to Know About Docker

by Cedric Hurst

It’s possible that containers and container management tools like Docker will be the single most important thing to happen to the data center since the mainstream adoption of hardware virtualization in the 90s. In the past 12 months, the technology has matured beyond powering large-scale startups like Twitter and Yelp and found its way into the data centers of major banks, retailers and even NASA. When I first heard about Docker a couple years ago, I started off as a skeptic. I blew it off as skillful marketing hype around an old concept of Linux containers. But after incorporating it successfully into several projects at Spantree I am now a convert. It’s saved my team an enormous amount of time, money and headaches and has become the underpinning of our technical stack.

  • docker
  • tips
  • dockerfile
  • devops
  • infrastructure
  • infra

Tips for writing Dockerfiles

by Sebastian Otaegui

We've been using Docker on our projects recently to ease development and deployment processes. Here are a few tips based on what we learned building and maintaining Docker infrastructure for production.

  • docker
  • tips
  • dockerfile
  • devops
  • layers
  • infrastructure
  • infra

Accelerate77: Spantree's Hackathon for Social Good

by kevin

Since its launch in 2009, Spantree’s team has shared a passion for volunteerism. As the second largest non-profit market in the country, Chicago is home to many worthy organizations, and with a diverse range of personal interests and passions among us, we felt a unanimous pull to use our skills and talents to serve our community as a team. With the spirit of social justice in mind, it seemed fitting to dedicate our community building efforts toward a special project each year on Dr. Martin Luther King, Jr. Day. Thus began our annual Hackathon for Social Good.

  • accelerate77
  • culture

Parsing spreadsheets with Clojure

by Roberto Guerra

Reading from an Excel file is surprisingly easy in Clojure. We'll see an example in Groovy and then compare it with one in Clojure.

  • clojure
  • spreadsheet

Games, Expectations, and Anticipating Users

by kevin

I love games, and I love game theory. Recently Spantree started doing lightning talks, which gave me the chance to share some of my favorite group "games".

  • games
  • client relations
  • UX
  • pirates

Functional Testing with Sparrow

by jonathan

Always interested in new approaches to testing on the front end, my ears perked up when a functional testing framework called Sparrow.js popped up in my Twitter stream. In essence, Sparrow allows you to run Selenium-style tests defined with Jasmine-style syntax. The immediate appeal for me was that front-end developers with experience in Jasmine could easily side-step into using Sparrow to write tests for broad user interactions that span multiple pages. I played around with it for an afternoon, and here's what I learned.

  • javascript
  • testing
  • functional testing
  • jasmine
  • sparrow

Data Processing with Clojure

by kevin

On every professional software engineering job I've had, data transformation always played a role. Most recently, at Spantree we needed to get a massive corpus of synonyms into the file format readable by Elasticsearch.

  • clojure
  • data
  • elasticsearch
  • ETL
  • partition-by

New cool feature in s3cmd

by Sebastian Otaegui

Amazon S3 is an online file storage web service offered by Amazon Web Services. Amazon S3 provides its service through REST, SOAP, or BitTorrent web services.

  • s3
  • aws
  • devops
  • s3cmd
  • short
  • sysadmin

Ratpack Templates

by Luke Daley, Roberto Guerra

Rendering HTML templates in Ratpack is very straightforward. We'll use the built-in templating features of Ratpack's Groovy module to render everything from very simple templates to complex templates with sub-templates and templates with dynamic data.

  • ratpack
  • groovy
  • templates

Basics of Ratpack Handlers

by Roberto Guerra

Ratpack is a "simple, capable, toolkit for creating high performance applications" on the JVM. It was originally inspired by Sinatra but it has taken on a life of its own with very interesting concepts. It is virtually a treasure trove of neat tricks with Groovy and its @CompileStatic feature. It has not reached 1.0 yet, and it is under heavy development, with releases on the 1st of every month.

  • ratpack
  • groovy

Monit: The Quick Fix

by kevin

Ideally, in a production system, everything works perfectly. Services never mysteriously crash, free memory is constantly available, and CPU load rarely spikes above 50%. Unfortunately, this is not always the case.

Backbone Repository Pattern

by Roberto Guerra

Backbone.js does a great job of providing structure to complex front-end applications, but oftentimes we find we need to do more to further abstract domain logic so it does not depend on the UI layer or the backend. In this article, we talk about how to apply the repository pattern to encapsulate interactions with the backend.

  • backbone
  • promises

Autogenerating Fake Users with Name Genius

by kevin

When developing most applications, there is some concept of a User, Customer, Person, etc. If you are making an application for people to use, then you should be able to represent those people.

Provisioning A PostGIS Instance With OpenStreetMap Data

by Gary Turovsky

Sometimes there is no amount of styling you can apply to Google Maps to make it look the way you want. To really control what is on the page, you will need to create your own maps and serve them up using your own tile server. But where do you start?

  • openstreetmap
  • postgres
  • databases
  • gis

Elasticsearch vs Google Search Appliance

by kevin

With the rise in popularity of Google Search Appliance and Elasticsearch, many companies want to know which search solution is best for them. At Spantree, we've been asked several times whether the GSA or Elasticsearch is the better solution, so we have decided to cover their respective strengths and weaknesses. While both are solid search tools that can meet most needs, they specialize in very different domains. To understand the strengths and weaknesses, it's important to note the general philosophy of each technology.

  • elasticsearch
  • gsa
  • google-search-appliance
  • search