Scoutapark.com
A story of pragmatic dockerization
Created by:
Dan Sheppard
Paul Czarkowski / @pczarkowski
For DevOps Days Austin
Welcome everybody, we're here to talk about ScoutaPark, a very small one-man startup with little more than a few big ideas, and
how we're devopsing and dockering to help us fulfill those big ideas.
Who are we ?
Dan Sheppard
Founder @ ScoutaPark
* Product/Project Manager by trade
* MBA candidate at Arizona State University
* Crazy about ScoutaPark
Who are we ?
Paul Czarkowski
Cloud Engineer @ Bluebox
* work at bluebox on private cloud as a service.
* we install, configure, and operate private cloud so that you don't have to.
Who are we ?
Paul Czarkowski
Snarky DevOps Princess @ Scoutapark
* in my spare time I ....
* docker - factorish
* chef community cookbooks - logstash, kibana
* help Dan out with ScoutaPark
WTF are we talking about?
What is ScoutaPark?
Business Challenges
How we use Docker
We want to talk about 3 main things -
1) What is scoutapark?
2) Business challenges and obstacles
3) Our docker/devops story so far.
What is ScoutaPark?
ScoutaPark is a free to use website that connects parents, dog owners, and other park visitors to parks with the exact features they are looking for.
ScoutaPark is to public parks what Yelp is to restaurants.
We help patrons find public parks
Share reviews, pictures, and other information about the parks
Let's talk business
Goals
Make money
Improve park experience for park patrons
But mostly make money
* Ad revenue model
* help people connect with their community resources
* Selfish
* Explore other revenue avenues such as data aggregation
* Ice cream trucks
Obstacles
No Money
No Time
No Developers
* Fulltime student, 3 kids, stay at home wife
* One man startup...
* work a day job
Obstacle: No Money
Cheap/Free resources for startups
University programs
Incubator programs
Fiverr, Craigslist, Elance, Freelancer
* Trello, Slack, Bitbucket, RAX startup program, etc.
* Arizona State Startup School
* Edson Innovation Student Incubator
* Negotiate the fees
Obstacle: No Time
Be smart about choices, only implement critical features
Understanding the problem > Solving the problem
Present > Future
Devop all the things
* talk to potential customers
* minimum viable product
* Be aware of the future but solve for the present
Obstacle: No Developers
Pragmatic Language Choices ( PHP )
Assume everything needs to be explained
Cheap outsourced developers for initial prototypes/MVP
Always be on the lookout for the dev partner
* cakephp
* Careful with cheap devs
* Use mock-ups
* Document everything
* If we are interesting... let me know
* Enough business, let's devop.
* docker all the things!
* keep host OS clean; expect multiple languages etc. as we grow
* encourage 12 factor style apps from the beginning
* stateless, loose coupled, etc
* outsource data persistence to aaS like RDS
* containers are the future, be prepared
* php isn't your usual dockerable language
* needs a web server - apache/nginx
* Probably needs a FastCGI-style PHP engine: php-fpm or HHVM
* Lock quirks of PHP into a box
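That nginx-to-FastCGI handoff is the core of "PHP in a box". A minimal sketch of the nginx side (socket, port, and paths here are illustrative, not the actual ScoutaPark config):

```nginx
# nginx hands PHP requests to a FastCGI engine (php-fpm or HHVM)
server {
    listen 8080;
    root /scoutapark/app/webroot;
    index index.php;

    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;   # engine listening on port 9000
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}
```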
Golden rules of Docker
that you should ignore
* They're not actually rules, just BS from ppl who think they know better
* multiple processes is A-OK
* don't need ssh in container, but might need docker exec
* may not need syslog, but do need central logging; if running haproxy you need syslog
* shouldn't touch a running container, just like shouldn't ever log into a server
* I do agree that you shouldn't blindly run community images ... but fuckit curl|bash
http://github.com/factorish
* Toolset I developed for making legacy style apps act like 12factor
* just a collection of existing tools tied together to help
* basic premise, container becomes 'the app'. container needs to be 12factor not app inside.
* ties together tooling to help with config templating, process management, service discovery and more
* examples from basic python app to full ELK stack with SD, auto-clustering mysql/galera.
Dockerfile
FROM scoutapark/base

RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y wget && \
    wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add - && \
    echo deb http://dl.hhvm.com/ubuntu trusty main | tee /etc/apt/sources.list.d/hhvm.list && \
    apt-get update && apt-get install -yq \
      hhvm-fastcgi nginx runit mysql-client

RUN curl -sSL -o /usr/local/bin/etcdctl https://s3-us-west-2.amazonaws.com/scoutapark/etcdctl-v0.4.6 && \
    chmod +x /usr/local/bin/etcdctl && \
    curl -sSL -o /usr/local/bin/confd https://s3-us-west-2.amazonaws.com/scoutapark/confd-0.7.1-linux-amd64 && \
    chmod +x /usr/local/bin/confd

ADD . /scoutapark
VOLUME /scoutapark
CMD ["/scoutapark/docker/bin/boot"]
* hhvm, nginx, runit
* etcdctl, confd
* boot script - default env, call confd, start apps.
* volume - can also use same container as volume container.
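Since the image installs runit to supervise nginx and hhvm together, a sketch of what a runit service script for the FastCGI engine might look like (the path and HHVM flags are illustrative and vary by version; this is not the actual ScoutaPark script):

```shell
#!/bin/sh
# /etc/service/hhvm/run -- runit runs this in the foreground
# and restarts it if the process dies
exec hhvm --mode server \
  -vServer.Type=fastcgi \
  -vServer.Port=9000
```

runit's only contract is "stay in the foreground"; the boot script just starts `runsvdir` over whichever services the argument selects.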
Docker Compose
nginx:
  build: .
  command: /scoutapark/docker/bin/boot nginx
  links:
    - mysql
    - hhvm
  ports:
    - "8081:8080"

hhvm:
  build: .
  command: /scoutapark/docker/bin/boot hhvm
  links:
    - mysql
  ports:
    - 9000

mysql:
  image: orchardup/mysql
  ports:
    - "3306:3306"
* removed env settings to fit on screen
* just a few seconds to get app running assuming prebuilt.
* if I run boot without argument it runs both nginx and hhvm
* as you can see, I'm using a community image for mysql... A-OK.
* done! docker machine or coreos or deis and go!
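Day-to-day usage with the compose file above is just a few commands (early docker-compose CLI):

```shell
docker-compose up -d     # build and start nginx, hhvm, mysql
docker-compose logs      # follow stdout/stderr from all three
docker-compose stop      # tear the stack down
```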
Reality kicks in
Developers not ready for this.
Data Persistence.
I'm not building this for me.
* running windows ( WAMP ), Apache+mod_php != nginx+hhvm
* really really shitty internet connection
* Language barrier makes teaching devs hard.
* have to persist data somewhere rackspace aaS offerings not as rich, want to run own db etc
* I'm just helping out, I may not be the one to operate it.
Config Management is dead.
Long live Config Management
Docker + Config Management
Chef, Puppet, Ansible
CM + Docker gives the best of both worlds.
Docker ~= packaging format
* At this point, all major CM has very functional docker integration.
* Using CM and Docker together gives you a way to move at your own pace to a more `docker` way of doing things.
* Start by using CM for managing backing services and state and to manage docker container running.
* Docker becomes roughly equiv of a packaging format. Start/Stop container not much diff to a service.
Docker Cookbook
https://supermarket.chef.io/cookbooks/docker
Installs Docker
Provides resources for most docker actions.
Puppet and Ansible both have roughly equivalent tooling.
* installs docker on most OSes
* LWRPs for build, run, push, etc
* anything can do in docker can do via cookbook resources
* at this point, all CM has similar stories.
Docker Cookbook
docker_container 'registry' do
  detach true
  port '5000:5000'
  action [:run]
end

docker_image 'scoutapark_app' do
  repository "scoutapark/app:#{version}"
  registry '127.0.0.1:5000'
  action [:pull]
end

docker_container 'scoutapark_app' do
  image "127.0.0.1:5000/scoutapark/app:#{version}"
  port '8080:8080'
  env 'MYSQL_HOST=10.0.0.22'
  action [:run]
end
* Run a local registry ( backed by cloud files, code snipped )
* Run scoutapark nginx container pulled from local registry.
* Actually using runit to manage running containers, using runit cookbook resources ofc.
* creates a service to run the container.
Scoutapark Cookbook
Cookbook lives in app repo /cookbook
Vagrantfile for development environment
Meez for cookbook skeleton, testing tooling.
two role recipes - 'web' and 'database'
web can do nginx+hhvm, apache+mod_php, docker
database can do mysql or RAX CloudDB
local .chef/knife.rb
* sort of env sort of app cookbook, heavy usage of community cookbooks.
* Vagrant to deploy local dev VM ... still trouble with internet speeds.
* Used meez for skel + test framework. rubocop, chefspec, serverspec, TK
* web and database recipes as well as common and security hardening
* Attribute to choose between nginx, apache, docker
* Attribute to choose between mysql or DBaaS
* local .chef/knife.rb has all but user and key. expects ~/.chef - same with RAX creds.
Scoutapark Cookbook
vagrant up
kitchen test
knife-rackspace
* as well as docker-compose we now have 3 new ways to run/test an environment.
* vagrant up for a local or cloud (using rax provider) dev environment
* kitchen test for cookbook development/testing
* knife-rackspace for deploying a test/stage/prod environment.
* need to look at chef-provisioning or terraform.
CM + Docker
* we have chef serving up our docker containers
* still some questions around operations.
* still have some problems to solve.
* logging, monitoring, multiple processes, config files...
Docker logging
log to stdout/stderr
set log files in nginx to /dev/stdout and /dev/stderr
* key is to never write logs to a file in the container
* most app configs allow setting logfile location. used /dev/stdout|stderr
Docker logging
# /etc/nginx/nginx.conf
daemon off;
user app app;
pid /app/nginx.pid;
error_log /dev/stderr;
access_log /dev/stdout;
...
* example nginx config logs set to /dev/stdout
* also note daemon set to off - foreground.
* set to run as user 'app'
Docker logging
Logspout!
sends docker logs -> external syslog
docker_container 'gliderlabs/logspout' do
  detach true
  volume '/var/run/docker.sock:/tmp/docker.sock:ro'
  command 'syslog://logs.papertrailapp.com:55555'
  name 'logspout'
end
* logspout by progrium saves a ton of work, as long as all apps log to stdout
* watches the docker log stream via mapped socket
* forwards logs to one or more syslog servers defined at runtime
* can be configured at runtime or via api
* currently using free papertrail tier, but will switch to elk.
* also logging all OS logs to papertrail via rsyslog.
Docker monitoring
Docker uses libcontainer for cgroups/namespaces.
metrics exposed in weird places in /proc
Example - /sys/fs/cgroup/memory/lxc/[longid]/
Hard to track where all of these metrics live.
* docker used to use lxc, now accesses cgroups/namespaces via libcontainer
* container metrics do get exposed by OS, but are hard to find.
* Can use collectd or sensu or similar to collect these metrics
* Docker 1.5 comes with some basic metrics in docker API
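For example, with the cgroupfs layout Docker used at the time, a container's metrics live somewhere like this (the exact paths depend on Docker version and cgroup driver; `<id>` is a placeholder for the full container ID):

```shell
# memory and cpu accounting for one container, straight from cgroupfs
cat /sys/fs/cgroup/memory/docker/<id>/memory.usage_in_bytes
cat /sys/fs/cgroup/cpuacct/docker/<id>/cpuacct.usage
```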
Docker monitoring
CAdvisor
Native docker metrics
Web UI
Rest API
InfluxDB and Prometheus outputs.
Probably needs to support Graphite.
* Cadvisor was written at google for kubernetes, but runs well for any dockered system
* Has a Web UI to look at local metrics to the machine
* Has a rest API that you can interrogate for metrics
* supports sending data to InfluxDB and Prometheus.
* needs to add support for more metrics storage engines
Docker monitoring
CAdvisor
docker_container 'google/cadvisor:latest' do
  detach true
  volume [
    '/:/rootfs:ro',
    '/var/run:/var/run:rw',
    '/sys:/sys:ro',
    '/var/lib/docker/:/var/lib/docker:ro'
  ]
  port '127.0.0.1:8080:8080'
  name 'cadvisor'
end
* the recipe that I install docker with also starts cadvisor.
* pretty simple to run, note a bunch of volume mounts
* most are read only
* only allowing localhost to reach the web ports
Docker monitoring
CAdvisor::InfluxDB::Grafana
* sensu/collectd -> influxdb for non-containers
* grafana for viewing metrics
* would prefer to use SaaS, but not a lot of good container options
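Pointing cAdvisor at InfluxDB is a couple of extra flags on the same container. A sketch (the host and database names are made up; check cAdvisor's storage driver docs for the current flag list):

```ruby
# same cadvisor resource as before, plus influxdb storage flags (illustrative)
docker_container 'google/cadvisor:latest' do
  detach true
  command '-storage_driver=influxdb ' \
          '-storage_driver_host=influxdb.example.com:8086 ' \
          '-storage_driver_db=cadvisor'
  name 'cadvisor'
end
```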
Configs in Containers
confd: a templating engine written in Go.
Pulls key/value pairs from a number of supported backends.
Environment variables, etcd, Consul, etc.
* we're using confd, a templating engine written in Go
* it is designed with SD in mind ( etcd,consul,etc) but also supports ENV
* go templates are fairly feature rich.
* there's no better way to handle writing config files in containers.
Configs in Containers
# conf.d/app_Config_database.php.toml
[template]
src = "app_Config_database.php"
dest = "/scoutapark/app/Config/database.php"
owner = "app"
group = "app"
mode = "0750"
keys = [
"/db"
]
check_cmd = "apachectl configtest"
reload_cmd = "apachectl restart"
* confd metadata
* src, dest, owner, group, mode
* set what keys/env vars to poll
* check_cmd - run a script for syntax checking apachectl configtest
* reload command - on template change, run a command. apachectl restart
Configs in Containers
# templates/app_Config_database.php
class DATABASE_CONFIG {
public $default = array(
'datasource' => 'Database/Mysql',
'persistent' => false,
'host' => '{{ getv "/db/host" }}',
'login' => '{{ getv "/db/user" }}',
'password' => '{{ getv "/db/pass" }}',
'database' => '{{ getv "/db/name" }}',
'prefix' => 'scout_',
'encoding' => 'utf8',
);
}
* only template config options that change from env to env
* otherwise can grow to be a large mess of values that need to be configured.
* can do loops and all sorts of crazy stuff.
Configs in Containers
#!/bin/bash
# bin/boot
export DB_HOST=${DB_HOST:-"mysql"}
export DB_PORT=${DB_PORT:-"3306"}
export DB_USER=${DB_USER:-"scoutapark"}
export DB_PASS=${DB_PASS:-"scoutapark"}
export DB_NAME=${DB_NAME:-"scoutapark"}
confd -onetime -config-file /scoutapark/docker/confd_env.toml
nginx &
wait
* bash script set to CMD of container
* load in env variables, set to defaults if empty
* call confd in single shot mode to write templates
* if using service discovery omit -onetime flag
* finally run nginx ( set to non-daemon and logs to stdout )
* wait keeps the script alive as PID 1 and helps reap zombie processes.
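The `${VAR:-default}` expansions in the boot script are plain POSIX shell: keep the variable's value if it's set, otherwise use the fallback. A standalone illustration:

```shell
#!/bin/sh
# ${VAR:-default} keeps an existing value, falls back otherwise
unset DB_HOST
DB_HOST=${DB_HOST:-"mysql"}
echo "$DB_HOST"    # -> mysql (was unset, default used)

DB_PORT=3307
DB_PORT=${DB_PORT:-"3306"}
echo "$DB_PORT"    # -> 3307 (existing value wins)
```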
Configs in Containers
$ docker run -d \
-e DB_HOST=10.0.0.1 \
-e DB_USER=scout \
-e DB_PASS=scout \
-e DB_NAME=scout \
--name scoutapark_web \
-p 8080:8080 \
scoutapark/web bin/boot
* regular docker run command
* env vars here are translated into config files via confd
* for now always mapping port:port for ease of load balancing
The Future
* what we're doing now isn't exciting; what we're going to do is.
* always think about where want to be when doing the now
* easier to improve by evolution than revolution
The Future
Service Discovery
confd <3 service discovery
run confd in daemon mode
run etcd or consul
use {{ getenv 'ETCD_HOST' }} for env vars
* perfect example is Confd + SD
* we already have most of the plumbing to do this
* confd makes it super easy to switch to service discovery
* use etcd ( consul also fine)
* registrator to track container->etcd keys
* proxy/LB reads registrator and dynamically LBs
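A sketch of how those pieces could wire together (registrator and confd are real projects, but the addresses, etcd port, and prefix here are hypothetical, not something ScoutaPark runs today):

```shell
# registrator watches the docker socket and writes
# container -> key mappings into etcd as containers start/stop
docker run -d --name registrator \
  -v /var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator etcd://10.0.0.5:4001/services

# inside the app container, drop -onetime so confd keeps
# polling etcd and rewrites configs when keys change
confd -interval 10 -node http://10.0.0.5:4001 \
  -config-file /scoutapark/docker/confd_env.toml
```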
The Future
Docker based platform
CoreOS + fleet
Kubernetes
Deis
Mesos
* don't want to accidentally build our own platform
* plenty of emerging options, we don't have to choose right now
The Future
Data Persistence
Use *aaS where it makes sense
Continue to use Chef where it doesn't.
* We want to outsource as much data persistence as possible where it makes sense.
* Easy win here will be database. RDS or CloudDB or ....
* However we'll likely need to continue to run some of our own.
* Do what we know here, chef.
* Our data is too important to trust to something we don't know 100%
THE END
Dan Sheppard
Paul Czarkowski / @pczarkowski