SuperPumpup (dot com)
General awesomeness may be found here.

Software

Articles on Software

12 January 2017

Protecting Long-running streams

Objective

We have long-running entities that we want to operate on with confidence.

Constraints

  • We need to be able to restart a system in a reasonable timeframe
  • We really want snapshots in-band (ES)

Canonical question

We have an account with 100k events (Hold, Capture, Release, Deposit) to process. Because of distributed systems, we will receive "duplicate" commands.

Solutions examined

Re-projecting all history to build state

Keeping 'rich' snapshots

  • All 100k transaction IDs (UUIDs at 16 bytes each give ~1.6 MB)
  • The 1.6 MB is too big to put "in-band"
  • This grows monotonically with account usage
  • "Hotspots"

Expiring old (vector) transactions

  • Only keeping, say, the last 1k transactions in the snapshot (16 KB is in line with the upper limit of ~25 KB that we have seen for reasonable performance for ES)
  • This is fine under normal use, but in a "bursty" scenario, your wall-time margin of error goes down just when your system comes under load (and is presumably more failure-prone)
  • Busy accounts may generate more than 1000 transactions in a single payroll run
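
To make that failure mode concrete, here is a rough sketch of a "last N" window. RecentIds, its limit parameter, and the tx-... IDs are made up for illustration; this is not code from the system described here.

```javascript
// Sketch: dedupe against only the most recent `limit` transaction IDs.
// `RecentIds` is a hypothetical name used for illustration only.
class RecentIds {
  constructor(limit) {
    this.limit = limit;
    this.ids = new Set(); // a Set iterates in insertion order
  }

  // Returns true if the ID is still in the window (a duplicate).
  // Otherwise remembers it and evicts the oldest ID once over the limit.
  seen(id) {
    if (this.ids.has(id)) return true;
    this.ids.add(id);
    if (this.ids.size > this.limit) {
      const oldest = this.ids.values().next().value;
      this.ids.delete(oldest);
    }
    return false;
  }
}

const windowOf1k = new RecentIds(1000);
windowOf1k.seen('tx-0'); // first delivery, remembered

// A quick retry is caught because 'tx-0' is still in the window.
const caughtQuickRetry = windowOf1k.seen('tx-0');

// A burst of 1,500 other transactions pushes 'tx-0' out of the window...
for (let i = 1; i <= 1500; i++) windowOf1k.seen(`tx-${i}`);

// ...so a late duplicate of 'tx-0' is treated as brand new.
const caughtLateRetry = windowOf1k.seen('tx-0');
```

The quick retry is detected and the late one is not, which is exactly the "margin of error shrinks under load" problem.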

Expiring old (wall clock) transactions

  • To do this, you need to also store a timestamp in addition to the UUID in memory, making the per-record size more than 20 bytes
  • On a bursty day, you wind up needing more space, and each record takes more space, so the general snapshotting performance deteriorates
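
The sizes quoted in the lists above are simple multiplication. A small sketch of the arithmetic follows; the 8-byte timestamp is an assumption (a 64-bit epoch value), since the text only says the record grows to "more than 20 bytes".

```javascript
const UUID_BYTES = 16;
const TIMESTAMP_BYTES = 8; // assumption: a 64-bit epoch timestamp

// Raw payload size of a snapshot that stores one record per transaction.
function snapshotBytes(transactionCount, bytesPerRecord) {
  return transactionCount * bytesPerRecord;
}

// "Rich" snapshot: every one of the 100k transaction IDs.
const richSnapshot = snapshotBytes(100000, UUID_BYTES); // 1,600,000 bytes, ~1.6 MB

// Vector expiry: only the last 1k IDs.
const lastThousand = snapshotBytes(1000, UUID_BYTES); // 16,000 bytes, ~16 KB

// Wall-clock expiry: the same 1k records, each with a timestamp attached.
const lastThousandWithTime = snapshotBytes(1000, UUID_BYTES + TIMESTAMP_BYTES); // 24,000 bytes
```

Under that assumption, 1k timestamped records already sit right at the ~25 KB ceiling mentioned above, before any bursty day adds more.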

Keeping transaction IDs in small, "external" streams

  • With the first implementation we came up with, we could not figure out how to assure integrity
  • The "command" that gets replayed writes that it was received (transaction-, event 0). This triggers a handler that tries to process the request (it does an NSF check, then either holds the funds or replies that they are not held). The handler either writes a hold in the account stream and then a "Held" in the transaction stream, or writes a "NotHeld" in the transaction stream
  • Handler of the transaction stream dispatches a reply to the requesting process

The problem here is subtle.

If the system crashes here:

And handler 1 picks up event A, it will need to check whether the account stream already has the event. However, the account stream may have a million events, and scanning that many for one event is prohibitively expensive.

ES has a provision for an "Event ID": a unique identifier representing this event. This is used internally for idempotency: if you write the same event twice, you should use the same identifier both times. So if you can generate an Event ID (GUID) deterministically, then you would actually be OK. However, that couples more tightly to an ES implementation detail than we would prefer.
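
For illustration, one common way to get a deterministic GUID is a name-based (version 5) UUID as described in RFC 4122: hash a fixed namespace together with a stable name, such as the transaction ID. The sketch below assumes Node.js. The function deterministicEventId, the choice of namespace, and the 'Held:' naming scheme are all hypothetical.

```javascript
const crypto = require('crypto');

// Placeholder namespace: this is the RFC 4122 DNS namespace UUID, reused here
// only so the sketch has a fixed value. A real system would pick its own.
const NAMESPACE = '6ba7b810-9dad-11d1-80b4-00c04fd430c8';

// Builds a version 5 (SHA-1, name-based) UUID from a namespace UUID and a name.
// The same inputs always produce the same ID.
function deterministicEventId(namespaceUuid, name) {
  const namespaceBytes = Buffer.from(namespaceUuid.replace(/-/g, ''), 'hex');
  const hash = crypto.createHash('sha1')
    .update(namespaceBytes)
    .update(name, 'utf8')
    .digest();

  const bytes = Buffer.from(hash.subarray(0, 16)); // copy the first 16 bytes
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // set the version to 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant

  const hex = bytes.toString('hex');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32)
  ].join('-');
}

// Hypothetical usage: derive the "Held" event ID from the transaction ID.
const firstAttempt = deterministicEventId(NAMESPACE, 'Held:tx-123');
const retryAttempt = deterministicEventId(NAMESPACE, 'Held:tx-123');
const otherTransaction = deterministicEventId(NAMESPACE, 'Held:tx-456');
```

A retry of the same transaction produces the same ID, so the store can reject the second write, while a different transaction gets a different ID.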

But if we write the version number of the account projection in the "HoldRequested" event, then we know that the subsequent "Held" event must be in the events following that, and we can search only that space to ensure that our work is not repeated.
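
A minimal sketch of that bounded check, using an in-memory array in place of the account stream. The event shapes and field names (type, transactionId, accountVersion) and the function holdAlreadyWritten are assumptions for illustration. A real implementation would read forward from the recorded version rather than load the whole stream.

```javascript
// Assumption: accountEvents[i] is the event at stream version i, and
// holdRequested.accountVersion is the last account version that had been
// projected when "HoldRequested" was written.
function holdAlreadyWritten(accountEvents, holdRequested) {
  // Only events written after that version can contain our "Held".
  const candidates = accountEvents.slice(holdRequested.accountVersion + 1);
  return candidates.some(
    (event) =>
      event.type === 'Held' &&
      event.transactionId === holdRequested.transactionId
  );
}

const accountStream = [
  { type: 'Deposited', transactionId: 'tx-1' }, // version 0
  { type: 'Held', transactionId: 'tx-a' },      // version 1
  { type: 'Deposited', transactionId: 'tx-2' }, // version 2
  { type: 'Held', transactionId: 'tx-b' }       // version 3
];

// The handler crashed after writing "Held" for tx-b, then restarted:
const alreadyDone = holdAlreadyWritten(accountStream, {
  transactionId: 'tx-b',
  accountVersion: 2
});

// A request whose hold was never written:
const notDoneYet = holdAlreadyWritten(accountStream, {
  transactionId: 'tx-c',
  accountVersion: 3
});
```

The search space is the handful of events written since the request, no matter how long the account's full history is.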

So in the end, we wind up with a messaging pattern that looks like this:

Account Messaging

You'll see that to process a Hold command, we write three events within the Account Service, and then communicate that back out by writing another.

To Capture the held funds, we also write three messages.

One thing that is striking about this pattern is that the handler that is handling Hold events projects entities from the Account stream, does business logic, and writes to the Account stream. These processes are coupled, for all intents and purposes - which is reasonable. They are coupled as a business process, so their implementation is coupled as well. Decoupling is not always necessary (or desirable).


08 January 2016

React Native & Jest Testing

Objective:

Have a way of testing my RN components with Jest that Doesn't Suck.

Background:

There exists a reasonable corpus of knowledge around testing React components with Jest. React is relatively new, Jest is newer, and though each has documentation, their interop has some nuance. Then, React Native gets added to the mix with a whole 'nother set of nuances and things kind of get crazy. So I spent a few days trying to figure out something reasonable.

This example is of some components whose render functionality will be tested. The project uses Redux, so there is not much store interaction, and a lot of passing of big prop trees through components. It shouldn't matter too much to the testing, but knowing that may give more context.

High Level Strategy:

We're going to use ReactTestUtils' shallowRender function to generate a "one" (really two) layer virtual DOM tree to inspect, then do some assertions on that. The object we are inspecting never actually gets written to HTML, and we are not doing "real" HTML DOM test/inspect. Rather we are just inspecting the objects that React Native is going to paint to a screen (which, being RN, is not a web browser), and can make sure our logic is working based on that.

In order to do this we need a convenient way to "render" a component to something we can inspect, and we need some tools for making assertions about that result.

Execution

Composing

Setup

I'll start with the "header" of my test file.

'use strict';

jest.autoMockOff();

const shallowHelpers = require('react-shallow-renderer-helpers');
const findMatchingType = require('./findMatching').findMatchingType;
const objectAssign = require('object-assign');

import React, { View, Text } from 'react-native';

const NewLoan = require('../NewLoan');
const NewLoanForm = require('../NewLoanForm');
const PrepareSchedule = require('../PrepareSchedule');
  • jest.autoMockOff() - is important because if I don't have that, one of the components in my NewLoanForm object I include later will blow up. It's a bit of a liability, but not something that I've gone after fixing yet.
  • const shallowHelpers = require('react-shallow-renderer-helpers'); - this pulls in a bunch of React functionality for testing component trees (which is totally useless to us since RN components wind up being structured pretty differently from React components), but does pull in some nice functionality wrapping the shallowRender function.
  • const findMatchingType = require('./findMatching').findMatchingType; - this is a tool I built for doing assertions in the next section.
  • const objectAssign = require('object-assign'); this is useful for having our "default" props and being able to add new ones per-test. You could get similar functionality from lodash or similar, but I would prefer to keep that dependency out of my projects until necessary.

  • import React, { View, Text } from 'react-native'; - notice a couple of things here. First, this is the ES6 import syntax, where everything else uses require. This is because 1) I much prefer this syntax and 2) if I try using this syntax for the other lines, the autoMockOff(); won't work and these would be mocked - even if I ask for them not to be. This is due to JS hoisting, and may be fixed in the near future. The module that's being imported is a mock that will be detailed elsewhere.

  • const NewLoan, const NewLoanForm, const PrepareSchedule - these are not mocked, and are required in for doing assertions. I want to assert that when I render NewLoan with certain props, it will have a NewLoanForm as a child, whereas if I give it different props, it will render a PrepareSchedule as a child. It may be nice to just do the comparison with string names, but this is good for now.

Render

It's nice to extract a reusable render method for components that you will be testing in different states, with different props, etc.

describe('NewLoan', () => {
  let newLoan;

  function renderNewLoan(props) {
    const defaultProps = {
      isFetching: false,
      loans: {
        preparedLoan: null,
        preparedSchedule: null,
        form: {
          fields: {},
          isFetching: false
        }
      }
    };
    // Merge into a fresh object so defaultProps itself is not mutated.
    const testProps = objectAssign({}, defaultProps, props);

    const shallowRenderer = shallowHelpers.createRenderer();
    shallowRenderer.render(() => <NewLoan {...testProps}/>);
    const output = shallowRenderer.getRenderOutput();

    return {
      props,
      output,
      shallowRenderer
    };
  }
  // Your assertions will go here
});

So I have a set of defaultProps that the NewLoan view will generally depend on to make its rendering decisions.

This defaultProps object then gets objectAssign'd (merged, basically) with the provided props.

A renderer is constructed, the component rendered into it, and the output returned in an object.
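
One thing to watch with the merge step: object-assign follows the semantics of the built-in Object.assign, which copies top-level keys only. A small sketch with plain objects (the amount field is made up for illustration) shows what happens when a test overrides part of a nested prop:

```javascript
const defaultProps = {
  isFetching: false,
  loans: {
    preparedLoan: null,
    preparedSchedule: null,
    form: { fields: {}, isFetching: false }
  }
};

// Overriding one nested key replaces the whole `loans` object,
// so `form` and `preparedSchedule` silently disappear.
const shallowMerged = Object.assign({}, defaultProps, {
  loans: { preparedLoan: { amount: 500 } }
});

// Merging the nested level explicitly keeps the other defaults.
const nestedMerged = Object.assign({}, defaultProps, {
  loans: Object.assign({}, defaultProps.loans, {
    preparedLoan: { amount: 500 }
  })
});
```

In the first case loans.form is undefined, which can make a component blow up in a way that has nothing to do with the behavior under test.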

Note that

return {
  props,
  output,
  shallowRenderer
};

Will really return an object like:

return {
  props: props,
  output: output,
  shallowRenderer: shallowRenderer
};

(thanks ES6 - and Obama).

Now let's render a component:

it('should display the NewLoanForm if there is no loan prepared', () => {
  const testProps = {};

  newLoan = renderNewLoan(testProps);
  const { output } = newLoan;
});

Sweet! We have something to look at. What is this output thing? console.log tells us:

Object {
  '$$typeof': Symbol(react.element),
  type: [Function: View],
  key: null,
  ref: null,
  props:
   Object {
     style: Object { flexDirection: 'column', flex: 1, width: 300, marginTop: 30 },
     children: [ [Object], [Object] ] },
  _owner: null,
  _store: Object {}
}

What I'm going to be asserting on generally is the type, props, and children - more likely the props of children when changing component state (though since this is Redux, MOST MOST MOST state should wind up being in the main state object and components only care about props & actions).

I was so excited about having this to assert on that my first tests when I got to this point were things like:

expect(output.props.children[0].props.children[0].type.displayName).toEqual('NewLoanForm');

I pasted that into our development Slack channel and immediately lost the respect of most of our engineering team. Well, at least that's my deepest fear. I'm sure they still love me. There were some (very correct) criticisms, and it was clearly time to move on to step 2: making decent assertions.

Asserting

Gotchas

I'm not sure where this fits in the flow of this post, but I should mention something horrifying that I found. If you look at the assertion: expect(output.type.displayName).toEqual('NewLoanForm'); you may think, "Cool, it seems my React components have a property called displayName within their type that I can test against. Thanks React!" And you may even test against that for a little while and it will work. But then you will do that for a new component and get this result:

- Expected: undefined toEqual: 'NewLoan'

Wut?

It turns out that if you declare a component using the syntax: var NewLoan = React.createClass({}), JSX will helpfully add a property displayName to the component. If you do: export default class NewLoan extends Component {}, then no such luck. You have no displayName. I would like the day of my life it took to figure all that out back, please.
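
If you do want a string name to compare against, one possible workaround is to fall back to the function's own name property, which classes and named functions have. This is only a sketch: componentName is a hypothetical helper, the two stand-in components below do not use React at all, and minification can rename functions, so treat it as a test-only convenience.

```javascript
// Stand-in for a class-style component: no displayName is set for you.
class NewLoan {}

// Stand-in for a createClass-style component, where the build step
// has attached displayName.
const NewLoanForm = function () {};
NewLoanForm.displayName = 'NewLoanForm';

// Prefer displayName when present, otherwise fall back to the function name.
function componentName(type) {
  return type.displayName || type.name || 'Unknown';
}

const classStyleName = componentName(NewLoan);      // falls back to NewLoan.name
const legacyStyleName = componentName(NewLoanForm); // uses displayName
```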

Ok, Assertions For Real

There is a nice module for doing assertions against React components generated by shallowRender - https://github.com/sheepsteak/react-shallow-testutils. Sadly, almost none of the matchers (isComponentOfType, findAllWithClass, etc.) work in RN because a RN component is pretty different from a React component. Its findAll function does work pretty well, though (and it seemed kind of magical until I looked through it and realized it was just like an interview-question-type tree traversal).
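
For a sense of what that traversal amounts to, here is a rough sketch over plain objects shaped like the shallow-render output (type plus props.children). This is not the library's actual source; findAllSketch and the sample tree are made up for illustration.

```javascript
// Depth-first walk: collect every element for which predicate(el) is true.
function findAllSketch(tree, predicate) {
  // Text children, numbers, null and undefined are leaves with nothing to match.
  if (tree === null || typeof tree !== 'object') return [];

  // `children` may be an array of elements.
  if (Array.isArray(tree)) {
    return tree.reduce(
      (found, child) => found.concat(findAllSketch(child, predicate)),
      []
    );
  }

  const self = predicate(tree) ? [tree] : [];
  const children = tree.props ? tree.props.children : undefined;
  return self.concat(findAllSketch(children, predicate));
}

const sampleTree = {
  type: 'View',
  props: {
    children: [
      { type: 'Text', props: { children: 'Hello' } },
      {
        type: 'View',
        props: { children: { type: 'Text', props: { children: 'World' } } }
      }
    ]
  }
};

const textElements = findAllSketch(sampleTree, (el) => el.type === 'Text');
const viewElements = findAllSketch(sampleTree, (el) => el.type === 'View');
```

Both Text elements are found, including the one nested as a single (non-array) child.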

Fortunately, this is all just JavaScript, and you can make your own matchers. These are what I came up with:

find[All]MatchingType

This will look for an element that matches what you're looking for. Simple. The All variant returns the array, the non-all variant just blows up if you don't have 1 exactly.

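import { findAll } from 'react-shallow-testutils';

// Returns every element in the tree whose type is the same as match.type.
// If match has no type, every element matches.
export function findAllMatchingType(tree, match) {
  return findAll(tree, (el) => {
    const typeMatch = match.type ? el.type === match.type : true;

    return typeMatch;
  });
}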

export function findMatchingType(tree, match) {
  const found = findAllMatchingType(tree, match);
  if (found.length !== 1) throw new Error('Did not find exactly one match');
  return found[0];
}

It should be noted that the type attribute that you are looking at here is sometimes friendly, sometimes a big nasty function. If you do the export default class... syntax, it has one form:

function PrepareSchedule() {
  _classCallCheck(this, PrepareSchedule);

  _get(Object.getPrototypeOf(PrepareSchedule.prototype), 'constructor', this).apply(this, arguments);
}

Otherwise, it can be:

function (props, context, updater) {
  // This constructor is overridden by mocks. The argument is used
  // by mocks to assert on what gets mounted.

  if (process.env.NODE_ENV !== 'production') {
    process.env.NODE_ENV !== 'production' ? warning(this instanceof Constructor, 'Something is calling a React component directly. Use a factory or ' + 'JSX instead. See: https://fb.me/react-legacyfactory') : undefined;
  }

  // Wire up auto-binding
  if (this.__reactAutoBindMap) {
    bindAutoBindMethods(this);
  }

  this.props = props;
  this.context = context;
  this.refs = emptyObject;
  this.updater = updater || ReactNoopUpdateQueue;

  this.state = null;

  // ReactClasses doesn't have constructors. Instead, they use the
  // getInitialState and componentWillMount methods for initialization.

  var initialState = this.getInitialState ? this.getInitialState() : null;
  if (process.env.NODE_ENV !== 'production') {
    // We allow auto-mocks to proceed as if they're returning null.
    if (typeof initialState === 'undefined' && this.getInitialState._isMockFunction) {
      // This is probably bad practice. Consider warning here and
      // deprecating this convenience.
      initialState = null;
    }
  }
  !(typeof initialState === 'object' && !Array.isArray(initialState)) ? process.env.NODE_ENV !== 'production' ? invariant(false, '%s.getInitialState(): must return an object or null', Constructor.displayName || 'ReactCompositeComponent') : invariant(false) : undefined;

  this.state = initialState;
}

There doesn't seem to be anything in there that ties it to one particular component, but if I compare it with a "dummy" component:

var FooClass = React.createClass({
  render() {
    return <View />
  }
})

It does not match. Who'da thunk.
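
The likely explanation is that el.type === match.type compares function identity, not source text. Assuming each createClass call builds a fresh constructor function (which the printed source above suggests), two components can print identically and still be different objects. A React-free sketch, where makeConstructor is a hypothetical stand-in for that factory:

```javascript
// Every call returns a brand-new function with the same source text.
function makeConstructor() {
  return function (props) {
    this.props = props;
  };
}

const FooClass = makeConstructor();
const BarClass = makeConstructor();

const sameSource = FooClass.toString() === BarClass.toString(); // identical text
const sameIdentity = FooClass === BarClass;                     // different objects
const matchesItself = FooClass === FooClass;
```

So the type check only succeeds when the test and the rendered tree hold a reference to the very same component.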

find[All]Matching

This matcher will try to match both the type of the element and the props. It's a trivial extension of the previous matcher:

// objectMatches is a local helper (not shown here) that compares expected props to actual props.
export function findAllMatching(tree, match) {
  return findAll(tree, (el) => {
    const typeMatch = match.type ? el.type === match.type : true;
    const propsMatch = objectMatches(match.props, el.props);

    return typeMatch && propsMatch;
  });
}

export function findMatching(tree, match) {
  const found = findAllMatching(tree, match);
  if (found.length !== 1) throw new Error('Did not find exactly one match');
  return found[0];
}

I have extracted those modules into a file on my filesystem, and will probably extract it out to a react-native-shallow-testutils or similar as it gets more robust.
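
The objectMatches helper used by findAllMatching is not shown in this post. One plausible shape for it, purely as an assumption, is a recursive "subset" comparison: every key present in the expected object must match the actual object, and extra keys on the actual object are ignored.

```javascript
// Hypothetical sketch of objectMatches; the real helper may differ.
function objectMatches(expected, actual) {
  // Nothing was asked for, so anything matches.
  if (expected === undefined || expected === null) return true;
  // Something was asked for but there is nothing to compare against.
  if (actual === undefined || actual === null) return false;

  return Object.keys(expected).every((key) => {
    const want = expected[key];
    const have = actual[key];
    if (want !== null && typeof want === 'object') {
      // Recurse into nested objects (style props, for example).
      return have !== null && typeof have === 'object' && objectMatches(want, have);
    }
    return want === have;
  });
}

const actualProps = {
  style: { flexDirection: 'column', flex: 1, width: 300, marginTop: 30 },
  isFetching: false
};

const matchesSubset = objectMatches({ style: { flex: 1 } }, actualProps);
const rejectsDifference = objectMatches({ style: { flex: 2 } }, actualProps);
const emptyMatchesAnything = objectMatches({}, actualProps);
```

With this shape, an element written as a bare tag with no props has empty props, so it matches on type alone.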

Putting it all together

This is a full test case:

it('should display the NewLoanForm if there is no loan prepared', () => {
  const testProps = {};

  newLoan = renderNewLoan(testProps);
  const { output } = newLoan;

  const match = findMatchingType(output, <NewLoanForm />);

  expect(match).toBeTruthy();
});

There we go. The contents of match in this case are the NewLoanForm element that got rendered, but mostly all I'm checking at THIS level of testing is whether by "default" it will render a form. In other tests I vary the props to have it render other things (more loan info collection, confirmation view, submission successful view, etc.).

Conclusion

This has been hard. I'm pretty good at the internet, and still, it's been very hard. I'm very grateful to Facebook for getting this great tech out there, and really look forward to watching these projects evolve.

Resources:

  • http://www.asbjornenge.com/wwc/testing_react_components.html
  • http://www.schibsted.pl/2015/10/testing-react-native-components-with-jest/
    • I did not like that this uses the shallowRender technique, and tried to avoid it. However, @cpojer suggested that it is a good way to test components
  • https://jamesfriend.com.au/better-assertions-shallow-rendered-react-components
  • Reactiflux Discord - #flux

14 December 2014

Internalizing Dependencies

So I saw an interesting conversation on Twitter this morning between @DHH and @thijs. They were discussing the NewRelic IPO, and the subject got around to running ancient Rails versions and the fact that a lot of the migration projects from Rails 2.3 have been driven in large part by the movement of the surrounding ecosystem (read: gem updates).

Continue reading


12 August 2014

Mode Analytics Is Helping Me Become A Better Developer

Like most other "software writers" who primarily write Rails code I've met with, I don't have any formal CS training - or any formal training at all, really. I just started learning how to build things, which was hard and horrible in PHP and spectacularly easy in Rails. Love it or hate it, it's SO EASY to build systems in Rails when you have no idea what you're doing. I'll always argue that, and think it's amazing.

That said, I never learned SQL well...

Continue reading


12 August 2014

ReactJS and Vert.x

I'm building a new project in Vert.x and it's amazing to be able to separate concerns well. However, I "miss" having a Rails project to fall back on. Here I'll talk a bit about Vert.x and exploring using the ReactJS framework to build the UI.

Continue reading


25 February 2014

Agh! People are starting to use my App! Now what?

This is a slightly beefed up version of the lightning talk I gave at Austin on Rails last night - Feb 25, 2014.

I had intended to give this talk on things I've learned about infrastructure as I've been able to help stabilize and grow things at my last two jobs. I've had a lot of experience working on deployment infrastructure, from Dreamhost, to Rackspace, to Heroku, AWS, and now am managing our infrastructure on Amazon OpsWorks using Chef (which I've been writing about recently).

However, as I thought about it, the problems that have been hindering our growth have not been so much server-infrastructure-related as visibility-related...

Continue reading


13 February 2014

Deploying a Multi-Rails-App OpsWorks Stack

I've written before about how OpsWorks kind of pushes you into a weird architecture because of their default behaviors for applications in a stack.

Namely, the system tries to push all the applications on all of the layers, and chaos ensues.

But, it turns out I needed to finally move our various apps into one stack, and after a few days of poking, prodding, waaaaaaaaaaiting for machines, I got it working, so I thought I'd document it.

Continue reading


14 October 2013

OpsWorks and System Architecture

TL;DR Moving to OpsWorks has been very helpful for us to get our system infrastructure represented and reproducible as code. I don't think it's a great tool for us to stay on as things grow in complexity, or we just want better ease-of-use. It's a great stepping stone.

At OwnLocal, I've been involved in porting our application infrastructure from EC2 instances controlled by Capistrano to Amazon's new product, OpsWorks.

I was impressed by the Capistrano scripts, though some things caused us a lot of pain:

  • We did not always get consistent deploys
  • We had to hard-code in IP addresses in our deploy files
  • Server configuration was unreliable (some of our servers had customized configurations that were problematic to rebuild)

Understanding what I do about OpsWorks has taken a lot of effort, and I'm certainly no expert (yet?) but we now have servers that spin up on weekdays as our system comes under load and down as the load abates without any intervention.

The first time I saw my load-based servers spin up was amazing

However, I have felt some tension between what I consider good system design (an ecosystem of small applications) and the way that OpsWorks "Stacks" work. Oddly, when you add an App to a stack using their friendly GUI, it adds the app to all the servers, and then a deploy command is triggered on ALL instances by default. Which is odd. There's no obvious way to deploy several different Rails app layers within the same stack (so communicating between service applications requires hard-coding addresses).

This design makes it seem easier to leave all your logic in One Big App

I also have been frustrated that though I have to write my own cookbooks to run anything other than the most trivial vanilla "Stack", I do not have access to the most powerful Chef concepts like "Search". Plus, debugging is an absolute nightmare. Amazon does a lot of "crafting" of the settings used on lifecycle events that are not easily (possibly?) replicated in something like a Vagrant environment. Therefore the feedback cycle on customizing scripts is ungodly slow at times (tens of minutes for some aspects of the life cycle).

Now I'm at a place where I have the set of cookbooks written to be able to deploy my infrastructure as code (GO ME!) and OpsWorks helped me get there. However, I'm really looking for a justifiable way to jump ship and go to a hand-rolled cloud (maybe VPC) that runs Chef (probably paying Opscode) with more modern tools like knife and search. In fact, I'll probably achieve this the way all great change is done - slowly and deliberately, one service at a time.

Goals for moving forward:

  • Search for nodes (or settings) in a sane way
  • Reconfigure multiple "stacks" more quickly (to install a package the "OpsWorks way", I click a lot, which is "convenient", though it's a pain to do on two stacks - like Production and Staging)
  • Have better visualization tools
  • Have a better deployment tool ("cap -S staging deploy branch=test_something_dangerous")

So in all, OpsWorks is great. If you want an easy-ish way to move to infrastructure as code, I encourage you to check it out. However, it's not the be-all and end-all, and it won't save you from learning Chef. So if your application has some significant complexity already, it may not be right for you.


14 July 2013

Why A Unicorn Dies Every Time You Test A Private Method

And other thoughts

This weekend has been awesome in that I've been able to spend a lot more time thinking about coding than coding.

I've been watching a lot of the Uncle Bob videos lately, and learned a ton. It's so neat to me to learn to be a better rubyist from watching videos about writing Java code. And really, it makes me understand that I'm not trying to be a better rubyist, I'm trying to learn to become a better software engineer. I guess it's like how, as a chemist, reading things about physics or biology can make you a better scientist, and able to think more effectively about chemistry and the questions to be asked there.

In any case, I've been thinking a lot, lately, about this idea of "Functional Core, Imperative Shell", and about functional code in general. And I think that I've at once resolved (at least for myself) the "tension" between writing, say OO and functional code, and simultaneously come to understand the assertion:

Every time you test a private method, a unicorn dies

Well-factored OO code tends to read like a bunch of buck-passing. Drilling down a stack trace can be a bit maddening - I asked this controller to do this, but instead it punts out to this model, which hands it off to a collaborator, which hands it off to a service object, that finally does a bit of work.

It's great to have "thin controllers and fat models", and you want your objects to "Do one thing and do it well", so that makes sense. You need to find the proper place for that responsibility to live. But finally, that object is going to DO SOMETHING. Those other objects were sending messages back and forth to each other, and that's "good" object-oriented code. But then one of them will finally INVOKE A FUNCTION and do some work. Functions do work, objects send messages to each other. Duh.

But in ruby, both of those things look the same:

def message_received_from_a_collaborator
  do_some_work
end

def do_some_work
  @score += 1 # assumes @score was initialized elsewhere (e.g. in initialize)
end

And a lot of times, I've struggled to decide on a technique for organizing these in my files. I've seen the question asked in ruby mailing lists, and the only responses I remember have been, "scoff you should never have so many methods in a class that you would need to THINK about organizing them," and "I don't know, I've struggled with that myself, and have settled on alphabetically". Holymoly. And then, I have taken that and gone and organized my classes so that the method definitions were alphabetical. Ack!

But, finally learning about interface segregation, and Uncle Bob's thoughts on form, I have a better strategy for that. And I know that first of all, anything that is not part of that object's public API should be private, and that functions should appear basically sorted by layers of abstraction - so as you drill down into gory details, you get lower and lower in the file. Awesome! Much better than, "Keep splitting your classes until you don't even know where you started" or alphabetical.
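
As a rough illustration of that layout (written in JavaScript to match the other code on this page, with a made-up Scoreboard class): the public API sits at the top, the private details sit below it in descending order of abstraction, and the only things a test touches are the public methods.

```javascript
class Scoreboard {
  #score = 0;

  // Public API: the message other objects send.
  goalScored() {
    this.#incrementScore();
    return this.#summary();
  }

  // Private details, one layer of abstraction down.
  #incrementScore() {
    this.#score += 1;
  }

  #summary() {
    return `Score: ${this.#score}`;
  }
}

// A test exercises the boundary only; the private methods are free to be
// renamed, merged, or split during refactoring without breaking it.
const board = new Scoreboard();
board.goalScored();
const summaryAfterTwoGoals = board.goalScored();
```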

And then this realization is tightly coupled with TDD and Sandi Metz's talks and writings about how you should test your objects, especially that you should look at each object as a capsule with well-defined boundaries and only test as things cross those boundaries. She argues that of course you should not test private methods as a general practice, but that sometimes they will remain, and a handy practice is to mark the tests as, "delete this test as soon as it fails."

And this made some sense to me, but it hadn't really coalesced until today, when I was watching the Uncle Bob video about TDD where he is literally turning an awesome tri-colored hat as he goes from red, to green, to blue (refactor). It was while thinking about that that I realized that the reason you should never have a test of a private method (thus causing the unfortunate death of a noble unicorn) is that private methods are never born during the "red" or "green" phase of development. Private methods are born during refactoring. You wrote failing tests that all interact with the public interface of that object, then got them green (possibly while working in that external-facing method), and then once you were green, you refactored well and segregated your interface.

I realize that a lot of the times I've wanted to start testing a private method, it was because testing the public methods that call that method is TOO HARD. Why is it too hard? Because I have a system that's too tightly coupled. This system is too tightly coupled and hard to test precisely because it was not test-driven to begin with. And despite the fact that, over the years of this codebase before me, the developers ran rcov on every test pass and made sure that they had "good" test coverage, I've got a tightly-coupled, non-segregated mess.

I've encountered this a few times now (either walking into a new codebase or revisiting an old one) and tried "THE BIG REWRITE" enough times (though on thankfully small scales) to be convinced of Uncle Bob's and Michael Feathers' (and probably loads of other smart people's) position that it's really not a great idea most of the time. It seems to me that the only way to get from here to an excellent codebase that I'm very proud of is to start today, make no excuses, and be disciplined about every line of code I write from here on out.

That, to me, is Super Pumpup.


20 February 2013

Passwords

I read an article early this year about how "2013 is the year that < insert internet giant > is going to eliminate the password!" I read a number of follow-on articles about it, including one by my CabForward Coworker and was like, "Self, thank god. Awareness is raised and we can start being smart about passwords." I read another great article here. Sweet!

But then I saw this at Bed Bath & Beyond:

And a little part of me died.

Continue reading


05 February 2013

Translation In An API Client Gem

I went to the austin.rb meeting last night and enjoyed the talk. The speaker, @benhamill, asked for feedback and the thing that I've been thinking about since but is too long to tweet is his concept of Translation vs Transliteration of an API.

Briefly: most things you get out of an API are JSON objects, and accessors on JSON objects don't "feel" very "rubyish". For example:

Continue reading


20 January 2013

Building Antifragile Software

Imagine an application to run a business that is one method, 10,000 lines long. Worst thing in the world, right? Just thinking about such a thing makes me shudder because you can look at it and know it's bad. But let's play with the idea a little bit.

So let's say the function of this application is to generate an invoice for a purchase with lots of line items. Cool, you may think if you like thinking about how software solves problems, an obvious thing that we could separate (to make this system a little more robust) is have the invoice built in two "phases". Phase 1 is preparing individual components (header, lines, footer) and then Phase 2 is composing them into an invoice.

Now you have a dramatically more robust system. You can fiddle with one part (say, the footer) feeling pretty safe that you are not going to change the "guts" of the invoice.
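
A toy sketch of that two-phase split, with invented function names and a made-up purchase shape, might look like this:

```javascript
// Phase 1: prepare each component independently.
function prepareHeader(purchase) {
  return `Invoice for ${purchase.customer}`;
}

function prepareLines(purchase) {
  return purchase.items.map((item) => `${item.name}: ${item.price.toFixed(2)}`);
}

function prepareFooter(purchase) {
  const total = purchase.items.reduce((sum, item) => sum + item.price, 0);
  return `Total: ${total.toFixed(2)}`;
}

// Phase 2: compose the prepared components into one invoice.
function composeInvoice(purchase) {
  return [
    prepareHeader(purchase),
    ...prepareLines(purchase),
    prepareFooter(purchase)
  ].join('\n');
}

const invoice = composeInvoice({
  customer: 'Acme',
  items: [
    { name: 'Widget', price: 10 },
    { name: 'Gadget', price: 2.5 }
  ]
});
```

Changing how the footer is worded now touches prepareFooter only; the lines and the header cannot be broken by that edit.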

No problem there, let's go a little further -

Continue reading