Planet Linux Australia


Craige McWhirter: Machine Ethics and Emerging Technologies - Paul Fenwick - LCA2016

Thu, 2016-02-04 14:04

Paul Fenwick took us on a journey of questioning what the future might look like in 10,000 years' time and whether what we're doing today is good for humanity.

  • More and more white collar jobs are being automated.
  • What are all these masses going to do with their leisure time?
  • More leisure time means more innovation.
  • Covered the benefits of drones.
  • Covered the dark side of drone use.
  • LARs (Lethal Autonomous Robots) are a significant issue.
    • Enables anonymous warfare
    • Long term target monitoring and execution
  • Can be used for long term environmental monitoring.

Another excellent, informative and entertaining talk by Paul.


Added the talk below.

Craige McWhirter: The Machine - Keith Packard - LCA2016

Thu, 2016-02-04 12:21

Keith Packard

  • Switching from Processor centric computing to memory driven computing
  • Described how the memory fabric works.
  • Will be able to connect any computing node to the shared memory.
  • Illustrated node assembly.
  • Next prototype will interconnect 320 terabytes of memory accessible storage.
  • Planning to build larger machines.
  • Putting in facilities to protect the hardware from a compromised operating system.
  • Showed how fabric attached memory connects.
  • Linux is being ported to the machine.
    • Linux with HPE changes.
    • All work is being open sourced.
  • Creating a new file system that allocates memory in 8G units.
    • Library File System (LFS)
  • Currently focussing on Librarian, machine-wide shared memory allocator.
  • Trying to provide a two level allocation scheme
  • No sparse files.
  • Locking is not global.
  • Fabric attached memory is not cache coherent.
  • Read errors are signalled synchronously.
  • Write errors are asynchronous and require a barrier.
  • Went through all the areas where they're working on Free Software.

Simon Lyall: 2016 – Thursday – Session 1

Thu, 2016-02-04 10:28

Jono Bacon Keynote

  • Community 1.0 (ca 1998)
    • Observational – No book on how to do it
    • Organic – people just created them
    • Technical Environment – Had to know C (or LaTeX)
  • Community 2.0 (ca 2004, 2005)
    • Wikipedia, Red Hat, OpenStack, GitHub
    • Renaissance – Stuff got written down on how to do it
    • Self Organising groups – GNOME, KDE, Apache Foundation – push creation of tech and community
    • Diversity – including of skills; non-technical people had a seat at the table and a role.
    • Company Engagement – Started hiring community managers, sometimes didn’t work very well
  • Community 3.0 ?
  • Why?
    • “Thoughtful and productive communities make us as a species better”
  • Access and power is growing exponentially
  • But stuff around us is changing
    • Cellphones are access method for most
    • Cloud computing
    • 3D printers, drones, cloud, crowdfunding, Arduino
    • Lots of channels to get things to everybody and everybody can participate
  • “We need to empower diversity of both people and talent”
  • Human brain has not had an upgrade in a long time
  • Bold and Audacious Goals
    • Openness is at the heart of all of these
    • Open source in the middle of many
  • Eg Drone
    • Runs Linux
    • Open API
  • “Open Source is where Society innovates”
  • “Need to make great community leadership accessible to everybody”
  • “Predictable collaboration – an aspirational goal where we won’t *need* community managers”
  • Not just about technology
    • We are all human.
  • Tangible value vs Intangible value
    • Tangible can be measured and driven to fix the numbers
    • Intangible – trust, dignity
  • System 1 thinking vs System 2 thinking
    • Instant vs considered
  • SCARF Model of thinking
    • Status – clarity of relative importance, need people to be able to flow between them
    • Certainty – Security and predictability
    • Autonomy – People really want choices
    • R – I got distracted by twitter, I’m sure it was important
    • Fair – fairness
  • Two Golden Rules
    • We accomplish our goals indirectly
    • We influence behaviour with small actions
  • We need to concentrate on building an experience for people who join the community
  • Community Workflow
    • Communication – formal, informal? CoC? Tech to use?
    • Release schedule, support?
    • How to participate, tech, hackathons
    • Governance structure
  • Paths for different people
    • New developers
    • Core Developers
    • Consumers
    • Downstream Customers
    • Organizations
  • Opportunity vs Belonging
  • Questions
    • Increasing Signal to Noise ratio – Trolls are easy[er], harder for people who are just not deft in communication. Mentorship can help
    • Destructive communities (like 4chan), how can technology be used to work against these – Leaders need to set examples. Make clear that abusive behaviour towards others is not acceptable. Won’t be able to build tools that will completely remove bad behaviour. Hard to tell destructive vs direct automatically, but tools can augment.
    • What about Linus type people? – View is that even though it works for him and is okay with people he knows, viewed from outside by others it sets a bad example.

Using Persistent Memory for Fun and Profit by Matthew Wilcox

  • What is it?
    • Retains data without power
    • NV-DIMMs available – often copy DRAM to flash when power lost
    • Intel 3D XPoint shipping in 2017. Will become more of a standard feature
  • How could we use it
    • Total System persistence
      • But the CPU cache is not backed up, so pending writes vanish
    • Application level persistence
      • Boot new kernel but keep the running apps
      • CPU cache is still a problem
    • Completely redesigned operating system to use
      • But we want to use in 2017
    • A special purpose filesystem
      • Implementation not that great
    • A very fast block device
      • Used as a very fast cache for apps that really need it. Not really general purpose
    • Small modifications to existing file systems
      • On top of ext2 (xip)
      • DAX
  • How do we actually use it
    • New CPU instructions (mostly to make sure that things are flushed from the CPU cache)
    • Special purpose programming language shouldn’t be needed for interpreted languages. But for compiled code libraries might be needed
  • NVML library
  • Stuff built on NVML library so far.
    • Red-Black tree, B-tree, other data-structures
    • Key-value store
    • Fuse file system
    • Example MySQL storage engine
  • Resources
  • Questions
    • In 2017 will we have mix of persistent and non-persistent RAM? – Yes . New Layer in the storage hierarchy
    • Performance of 3D XPoint will be a little slower than DRAM but within the ballpark, various trade-offs with other characteristics
    • Probably won’t have native crypto

Dropbox Database Infrastructure by Tammy Butow

  • At Dropbox for the last 4 months, previously DigitalOcean, before that National Australia Bank
  • Using MySQL for last 10 years. Now doing it FT.
  • 400 Million customers
  • Petabytes of data across thousands of servers
  • In 2012 Dropbox just had 1 DBA, but was huge then.
  • In 2016 it has grown to 9 people
  • 6000 DB servers -> DB Proxy -> DB as a service (edgestore) -> memcache -> Web Servers (nginx)
  • Talk – Go at Dropbox, Zviad Metreveli on YouTube
  • Applications talk directly to edgestore not directly to database
  • Vitess is a MySQL proxy (by YouTube) similar to what Dropbox wrote. Might move to that
  • Details
    • Percona 5.6
    • Constantly upgrading (4 times in last year)
    • DBmanager – service we manage MySQL via
  • Each Cluster is primary + 2 replicas
  • Use xtrabackup (to HDFS locally and S3)
  • Tools
    • Tasks grow and take time
    • DBmanager
      • Automating DB operations
      • Web interface with standard operations and status of servers
      • Cloning Screen
      • Promotion Screen
      • Create and restore backups
      • WebUI gives you feedback and you can see how things are going. Don’t need magic command lines. Good for other teams to see stuff and do stuff (options right in front of them).
      • Benchmarking
      • Database job scheduling and prioritization. Promotion will take priority over anything else.
      • Common logging, centralized server and nice gui that everyone can see
    • HERMES
      • Available on Dropbox GitHub
      • Makes visible all quests and actions that need to be done by the team
    • Monitoring
      • Grafana
  • Performance
    • Improving backup and restore speed.
      • LZOP
      • xtrabackup
  • Auto-remediation (naoru) – up on github at some point
  • Inventory Management
    • Machine Database (MDB)
    • Has tags for things like kernel versions
  • Diagnostics
    • Automated periodic tcpdump
    • Tools to kill long running transactions
    • List current queries running
    • atop
  • The Future
    • Reliability, performance and cost improvements
    • Config management
    • Love the “Go Programming Language” by Kernighan
    • List of Papers they love
  • Questions
    • Using Percona, not MariaDB. They also shard rather than cluster DBs
    • Big culture change from Bank to Dropbox – At Bank tried to decom old systems, reduce risk. At Dropbox everyone is very brave and pushing boundaries
    • Machine database is largely built automatically
    • Predictive Analysis on hardware – Do some , lots of dashboards for hardware team, lifecycle management of hardware. Don’t hug servers. Hug the hardware class instead.
    • Rollbacks are okay and should be easy. Always be able to roll back a change to get back to a good stack.
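
The notes say Dropbox shards its databases and that each cluster is a primary plus 2 replicas. As a rough illustration only, here is a minimal sketch of hash-based shard routing. The hashing scheme, the key format and the host names are all assumptions made up for this example; the talk notes do not describe how Dropbox actually maps data to shards.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (illustrative scheme only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def hosts_for(key, shards):
    """Return the cluster (primary + replicas) that owns the given key."""
    return shards[shard_for(key, len(shards))]


# Hypothetical clusters, each a primary plus 2 replicas as in the notes.
shards = [
    {"primary": "db-a1", "replicas": ["db-a2", "db-a3"]},
    {"primary": "db-b1", "replicas": ["db-b2", "db-b3"]},
]

cluster = hosts_for("user:42", shards)
print(cluster["primary"], cluster["replicas"])
```

A stable hash (rather than Python's built-in `hash`, which is randomised per process for strings) keeps the same key on the same shard across restarts.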


Craige McWhirter: LCA2016 Thursday Keynote - Jono Bacon

Thu, 2016-02-04 10:09

Jono Bacon spoke about how open communities are changing the world and how they may be improved in the future.

Community 1.0
  • Early Free Software communities were built from observing other groups around them and figuring things out as they went along.
  • Very high technical barrier of entry
Community 2.0 The Renaissance
  • Allowed broader participation, with Wikipedia as an example.
  • Knowledge had been built to allow people to start in the community from a common point
  • Self organising groups
  • Enabled greater diversity
  • Companies began engaging with communities.
What Does 3.0 Look Like?
  • How do we build effective reproducible communities?
    • Thoughtful and productive communities advance the human race,
  • Sharing the knowledge on how to build effective communities is going to be
  • Covered ubiquitous computing growth, 3D printing, Arduino etc
  • Crowd funding as one method of empowering consumers.
  • Not just consumption but empowering people to have better lives is key.
  • We need to empower diversity in all its forms.
  • Openness is the greatest enabler.
  • The principles of openness are flowing through all forms of technology, life and work.
  • In a world worried about AI, we the people should be ensuring that it's open and taking control.

"Open Source is where society innovates" - Jono Bacon

  • We need to crack predictable collaboration. Making great community leadership available to everyone.
  • We can do better, we've only scratched the surface with our success thus far.
How do we do this?
  • For self respect we need to contribute. To contribute we need access.
  • Jono realised that his role as community manager was to help other contributors be as effective as possible with their time when they're contributing.
  • Discussed the difference between system 1 and system 2 thinking.
  • However behavioural economics is hard to apply in practice.
  • The principles can be pulled out and used though.
  • Discussed SCARF model of social threats and rewards.
  • From this model we can figure out how to put this into practice.
    • We accomplish goals indirectly. Gave Boeing as an example.
    • We influence behaviour with small actions. Recommended the book Lunch.
  • Build comprehensive rewarding experiences.
  • Need to make building a successfully structured community easy.
  • Described experiences from different stakeholder perspectives.
community_3.0 = { system 1 and 2 thinking + behavioural patterns + workflow + experiences + packaged guidance }

The most important feeling we can create is a sense of belonging.

Craige McWhirter: Introduction to monitoring with Prometheus - Jamie Wilkinson - LCA2016

Wed, 2016-02-03 17:21

Jamie Wilkinson gave an overview of the Prometheus monitoring tool, based on the Borgmon white paper released by Google.

  • Monitoring complexity was becoming expensive.
  • Borgmon inverted the monitoring process
    • Was heavily relied upon at Google.
  • Prometheus, Bosun, Riemann are stream based monitoring like Borgmon.
  • Prometheus scrapes /varz
  • Sends alerts as key value pairs
  • Using shards for scaling.
  • Defines targets in a YAML file.
  • Data storage is in a global database in memory
  • Use higher level abstractions to lower cost of maintenance
  • Use metrics, not checks
  • Design alerts based on service objectives.
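
To make "metrics, not checks" a little more concrete, here is a minimal sketch of the kind of plain-text page a scrape returns: one line per sample, with a metric name, optional key/value labels and a number. The metric names and labels are invented for illustration, and a real service would normally use a Prometheus client library rather than formatting lines by hand.

```python
def render_metrics(samples):
    """Render (name, labels, value) samples as name{label="value"} number lines."""
    lines = []
    for name, labels, value in samples:
        if labels:
            label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical samples; the names and labels are assumptions for this sketch.
samples = [
    ("http_requests_total", {"region": "au", "code": "200"}, 1027),
    ("queue_depth", {}, 3),
]

page = render_metrics(samples)
print(page, end="")
```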

Another brilliant monitoring talk from Jamie.

Simon Lyall: 2016 – Wednesday – Session 3

Wed, 2016-02-03 16:28

The future belongs to unikernels. Linux will soon no longer be used in Internet facing production systems. by Andrew Stuart

  • Stripped down OS running a single application
  • Startup time only a few milliseconds
  • Many of the current ones are language specific
  • The Unikernel Zoo
    • MirageOS – Must be written in OCaml
    • Rump – Able to run general purpose software, run compiled POSIX applications, largely unmodified. Can have threading but not forking
    • HalVM – Must be coded in Haskell
    • Ling – Erlang
    • Drawbridge – Microsoft research project
    • OSv – More general purpose
    • “Something about Unikernels seems to attract the fans of the ‘less common’ languages”
    • plus a bunch more..
  • Unikernels and security
  • A bunch of people pointed out problems, and alternative solutions to what unikernels are trying to solve.


An introduction to monitoring and alerting with timeseries at scale, with Prometheus by Jamie Wilkinson

  • SRE ultimately responsible for the reliability of , less than 50% of time on ops
  • History of monitoring, Nagios doesn’t scale, hard to configure
  • Black-box monitoring for alerts
  • White-box monitoring for charts
  • Borgmon at Google, same tool used by many teams at Google
  • Borgmon not Open Source, but instead we’ll look at Prometheus
  • Several alternatives
  • Borgmon
  • Alert design
    • SLI – a measurement
    • SLO – a goal
    • SLA – economic incentives
  • Philosophy
    • Every time you get paged you should react with sense of urgency
    • Those that are not important shouldn’t be paged on, perhaps just to console
  • Instrumentation
    • Client exports an interface, usually HTTP; Prometheus polls /metrics on this server and gets a plain page with numbers
    • Metrics are numbers not strings
    • Don’t need timestamps in the data
  • Tell Prometheus where the targets are in the “scrape_configs”
    • All sorts of ways to find targets (DNS, etc)
  • Variables all have labels, name, things like locations
  • Rule evaluation
    • recording rules
    • tasks run built in functions like sum up data by label (eg all machines with the same region label), find rate of change etc
  • Pretty graphs shown in demo
  • Questions
    • Prometheus exporting daemon/proxy
    • Language ability to support things like flapping detection/ignore
    • Grafana support for Prometheus exists
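
The rule evaluation described above (summing data by label, finding a rate of change) can be sketched in a few lines. This is not Prometheus's rule language or implementation, just the underlying arithmetic under simplifying assumptions: samples are held in memory as (labels, value) pairs, and the rate is taken between two points without handling counter resets. All names and numbers are made up for illustration.

```python
from collections import defaultdict


def sum_by_label(samples, label):
    """Sum sample values grouped by one label (e.g. all machines in a region)."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


def rate(earlier, later):
    """Per-second rate of change between two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in time order")
    return (v1 - v0) / (t1 - t0)


# Hypothetical samples: request counts per host, labelled with a region.
samples = [
    ({"host": "web1", "region": "au"}, 120.0),
    ({"host": "web2", "region": "au"}, 80.0),
    ({"host": "web3", "region": "nz"}, 50.0),
]

by_region = sum_by_label(samples, "region")
requests_per_second = rate((0.0, 1000.0), (60.0, 1600.0))
print(by_region, requests_per_second)
```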


Craige McWhirter: The future belongs to unikernels - Andrew Stuart - LCA2016

Wed, 2016-02-03 16:24

Andrew Stuart gave an overview of the current state of unikernels:

  • Unikernel zoo is increasing.
    • MirageOS is the most mature at present and requires code written in OCaml.
    • HalVM requires your code to be written in Haskell.
    • Ling requires your code to be written in Erlang.
    • runtime.js is the same thing as the above but in JavaScript.
    • OSv is not language specific and very minimalist.
    • Rump kernels are essentially a very stripped down version of NetBSD and will run some other unikernels.
  • Threading, not forking.
  • Might be a Linux based unikernel coming.
Unikernels and Security
  • Suggests machines with user sign-in capabilities will become less common due to security risks.
  • Unikernels are not invulnerable.
  • MirageOS have a bitcoin pinata.

Craige McWhirter: Sentrifarm - open hardware telemetry system for Australian farming conditions - Andrew McDonnell - LCA2016

Wed, 2016-02-03 15:02

Andrew McDonnell created Sentrifarm in 2015.

  • Low power
  • Distributed
  • Using radio for communication
  • Local storage
  • Cheap
  • They entered Hackaday.
    • Wanted to learn new skills
    • Have fun
    • Experiment
    • Perhaps produce something useful
  • There were lots of discarded prototypes
  • So many cheap devices facilitating experimentation.
  • Radio links were not quite as open as he would have liked.
  • Used LoRa based ISM-band radio
  • Learned how much easier it is to have PCBs fabricated these days.
  • Fabrication lead times can be about 6 months.
Open Hardware Components
  • 8 devices Carambola2 - Linux OpenWRT board
  • Replaces need for Arduino IDE
  • Open Source
  • IDE agnostic
MQTT for communication
  • Specifically MQTT-SN for low bandwidth
  • Packages
    • mosquitto
    • mqtt_sn_tools
    • arduino-mqtt-sn
  • Gateway runs OpenWRT

Andrew provided an overview of how the gateway processing model worked.

  • Ubuntu 14.04
  • Docker 1.8.3
  • Carbon + Whisper + Graphite
  • Grafana
  • Custom Python scripts
  • Millions of lines of code and Andrew only had to write 7.
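
For a sense of how little glue code a stack like this needs, here is a minimal sketch of feeding a sensor reading to Carbon using its plaintext protocol, which takes one `path value timestamp` line per data point (port 2003 by default). This is not Andrew's code: the metric path, host and reading are assumptions invented for the example.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point as a Carbon plaintext line: 'path value timestamp'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_to_carbon(lines, host="localhost", port=2003):
    """Send already formatted lines to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Hypothetical metric path and reading; nothing is sent until send_to_carbon is called.
line = carbon_line("sentrifarm.paddock1.temperature", 31.5, timestamp=1454475720)
print(line, end="")
```

Calling `send_to_carbon([line])` would deliver the point to a Carbon instance listening on the assumed host and port.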

  • 3D printed some components.

    • Made a custom holder for the PCB
  • Used OpenSCAD to design the component.
  • Made the antenna himself with plans off the Internet.
    • Got range up to 9km.

Andrew's project is an ingenious solution to a serious problem. I need one of these for myself!


Added the talk itself below.

Simon Lyall: 2016 – Wednesday – Session 2

Wed, 2016-02-03 14:28

Welcoming Everyone: Five Years of Inclusion and Outreach Programmes at PyCon Australia by Christopher Neugebauer

  • How to bring more people to community run events
  • Talk is not about diversity in tech
  • Talk is about “Outreach and Inclusion in Events”
  • Outreach = getting them in , Inclusion = making them feel welcome
  • About funding programmes for events
  • FOSS happens over the Internet , face-to-face is less common than in other areas/communities
  • Events are where you can see the community
  • BUT: Going to a conference costs money – travel, rego, parking, leave from job
  • Events have equality of access problem
  • Inequity of access is  a problem with diversity
  • Solution: Run outreach programmes
  • Money can reduce the barriers, just spending money can help solve the problem
  • Pycon Australia has had outreach for last 5 years
  • FOSS vs other outreach programmes
    • Events have easy goals, define ppl/numbers to target, exact things to spend on, time period defined
    • Similar every year, similar result each year
    • Long-term results are ill-defined
    • Engagement is hard to track
  • Pycon Australia
    • Fairly independent of Python software foundation
    • Biggest Pycon within 9 hours of flying
    • Pycon US – 2500 attendees, $200k on financial assistance
    • Pycon Aus 2015 – 450 attendees , 5-8% of budget on funding
  • 2011
    • Harassment and Codes of Conduct were a big thing
    • Gender diversity policy, code of conduct, 20% of speakers were women, First Gender diversity grants
    • 2 Grants, – 1 ticket and 1 Ticket + $500 funded out of general conf budget
    • 7 strong applicants at time when numbers were looking low (later picked up)
    • Sponsor found and funded all 7 applicants
  • 2012
    • 1st of 2 years running conf in Hobart
    • Moving from Sydney is hard. Australia big and people have to fly between cities (especially to Hobart)
    • Hobart long way away for many people and small number of locals
    • Sponsor increased funding to $700, funded 10 people for $500 + ticket
    • Previous grant recipient from 2011 was speaking in 2012
  • 2013
    • Finding more speakers from more places
    • Outreach and Speaker support run out of the same budget, cap removed on grants so International travel possible.
    • Anyone could apply; removed the purely gender-based limit, so other people who needed funding could apply. Eg students, teachers, geographic minorities
    • $12,500 allocated
    • As more signups and more money came in more could go to the assistance budget
    • If you remove gender targeting then what happens to diversity?
    • Got groups like GeekGirlDinners to target people that needed grants rather than directly chasing people to apply.
    • Over half aid budget going to women
    • Teachers are good force multipliers
  • 2014
    • Lost previous diversity Sponsor
    • Previously $5k from Sponsor + $7k from general fund.
    • Pycon US – Everybody pays to attend ( See Essay by Jesse Noller – Everybody Pays )
    • Most speakers have FOSS-friendly employers or can claim money
    • Argument: Some confs make everybody pay no matter their ability.
    • Told speakers that by default they would be charged, but that the charge would be waived just by asking. Also said where the money was going and prioritised speakers for assistance. Also all organisers paid
    • Extra money from about $7000
    • Simplified structure of grants, less paperwork, just gave people a budget. Worked well since many people went with good deals.
    • Caters better for diverse needs
    • Also had Education Miniconf, covered under teacher training budget. Offered to underwrite costs of substitute teachers for schools since that is not covered by normal school professional-dev budget
  •  Results
    • Every time at least one funding recipient has spoken at next conference
    • Many fundees come back when get professional jobs
    • Evangelize to their friends
  • Discovery
    • expanding fund gets people you might not expect
    • Diverse people have diverse needs
    • Avoid making people do paperwork, just give them money
    • Sponsors can make bootstrapping a programme easier
    • Don’t expect 100% success
    • Budget liberally, disburse conservatively
    • Watch out for immigration scams
    • Decline requests compassionately
  • Questions
    • Weekend hard for Childcare – Not heavily targeted
    • Targeting Speakers for funding rather than giving all of them means it gets to go a lot further. Better Bang for buck

Sentrifarm – open hardware telemetry system for Australian farming conditions by Andrew McDonnell

  • Great time to be a maker, everybody is able to make something
  • Neighbour had problem with having to measure grass fire danger in each paddock before going out with machinery during summer
  • Needs Wind Speed, temperature, humidity
  • Sentrifarm
    • Low power, solar
    • distributed
    • Works in area with slow internet, sim card expense adds up however
    • Easy to use for farmer, access via their farm.
    • Data should not be owned by cloud provider
  • Hackaday Prize
    • Build “something that matters”
    • Prizes just for participating
    • Document progress, produce a video
  • Our Goals
    • Cheap and Cheerful
    • Aussie “bush mechanic” ethos
    • Enjoy the adventure
  • Used stuff from 24+ other opensource projects
  • Prototyping
    • Tried out various micro-controllers and other equipment
    • Most you could buy for only a few dollars
    • Tools – Bus Pirate
  • Radio links
    • ISM-band radio module “LoRa” technology
    • SPI interface, well documented SX1276
    • $20 for the module
    • Proprietary radio protocol, long range, low power, but open interface on top of it
  • Eagle used (alt is KiCAD) to design circuit
    • Build own shields to plug sensors and various controllers into
  • – run one command, creates an Arduino project and builds with one command for multiple micro-controllers
  • MQTT-SN – communications protocol for low-bw links.
  • Breakdown of his stack, see his slides for details
  • Backend Software
    • Ubuntu
    • Docker
    • Carbon + Whisper + Graphite, Grafana
    • “Great time to be a hacker, using who knows how many lines of code and only had to write 7 to get it to work together”
  • Grafana hard to set up but found a nice docker container
  • Data kept separately from the container
  • Goal to get power down
  • Used 3D-printer to create some parts for mounting bits.
    • OpenSCAD – Language to design the parts
  • Range of LoRa of 5km un-elevated, 9km up a tower with simple home-built antenna
  • Won a top-100 prize at Hackaday of a t-shirt
  • You can do it
  • Questions
    • Asked how it survives weather? – Not a lot of experience yet, some options
    • How likely are others to use it? – Maybe, but main goal was building it


Craige McWhirter: Usable formal methods - are we there yet? - Stefan Götz - LCA2016

Wed, 2016-02-03 12:17

Stefan Götz

  • Software reliability is often defined by industry standards.
  • Software analysis can be divided into three parts:
    • Static analysis
      • Examines code, no compilation or execution.
      • Share input with compilers
      • Example static analysers:
        • BLAST, Cppcheck, Eclipse, Frama-C
        • LLVM/Clang
        • Sparse
        • Splint - used by eChronos
    • Proof systems
    • Model checking
  • Performs pattern matching
  • Understands C types
  • Language model and rule matching
  • Control and data flow analysis
  • Similar to compiler setup
  • Run against entire application code.
  • Improved auto generated code and readability.
  • Found incorrect character conversion.
  • Discovered signal sets unintentionally returning a boolean.
  • Some false positives with unused code.
  • Some macros were not picked up.
  • Some variable initialisation not picked up.
  • Works very well overall.

Felt the time invested in using Splint was well spent and brings a lot of peace of mind to the project.

Model Checking
  • Uses CBMC
  • Requires a little plumbing code and training.
  • Made them reconsider and improve execution timing.
  • Scalability requires improvements.
  • Being integrated with eChronos.
Are we there yet?
  • Open Source static analysis tools need improving and established best practices.
  • Model checking is not yet out of the box

Craige McWhirter: CloudABI - Ed Schouten - LCA2016

Wed, 2016-02-03 12:17

Ed Schouten provided a detailed tour of Capsicum and CloudABI.

  • AppArmor is an afterthought
  • Puts the burden back on users
  • Not linked to security policies.
  • Capsicum is a FreeBSD method that sandboxes software
  • Works well with small applications but doesn't scale.
  • Questions why UNIX can't run third party binaries safely.
What is CloudABI?
  • CloudABI is a POSIX-like runtime environment based on Capsicum.
  • Capability based security with less foot shooting.
  • Global namespaces are entirely absent
    • By default can only perform actions with no global impact.
  • Symbiosis, not assimilation as it can run side by side with traditional applications.
  • File descriptors are used to provide additional rights.
  • Provided an example of using CloudABI to provide a secure web service.
  • You can use wrappers to provide features missing from CloudABI.
  • Only has 58 system calls. Incredibly compact.
  • Working towards having support for more POSIX operating systems.
  • Allows reuse of binaries without compilation.
  • Provided an example of a simple CloudABI ls program.
  • How to execute it via the shell
  • Feels there are scalability problems with CloudABI.
  • Wrote cloudabi-run to make it feel less clunky to run.
  • Replace CLI arguments with a YAML file.
  • Easy to configure.
  • Impossible to invoke programs with the wrong file
  • Reduces start-up complexity.
  • Gave an example of CloudABI as the basis of a cluster management suite.
  • Provides a 100% accurate dependency graph.
  • Gave an example of "CloudABI as a Service".

Simon Lyall: 2016 – Wednesday – Session 1

Wed, 2016-02-03 11:28

Going Faster: Continuous Delivery for Firefox by Laura Thomson

  • Works for Cloud services web operations team
  • Web Dev and Continuous delivery lover
  • “Continuous delivery is for webapps” – Maybe not just Webapps? Maybe Firefox too
  • But Firefox is complicated
  • Process very complicated – “down from 5 source control systems to 3”
  • But plenty of web apps are very complicated (eg Netflix)
  • How do we continuously deliver Firefox?
  • How it works currently
    • Release every 6 weeks
    • 4 channels – Nightly -> Aurora -> Beta -> release
    • Mercurial Repo for each channel
  • Release Models
    • Critical Mass – When enough is done and it is stable
    • Single Hard deadline – eg for games being mass released
    • Train Model – fixed intervals
    • Continuous Delivery
  • Deployment Maturity Model
  • Updates
    • New Build -> Generate  a diff -> FF calls back -> downloads and updates
    • Hotfixes
    • Addons automatically updated
  • Currently pipeline around 12 hours long, lots of tests and gatekeeping
  • “Go Faster”
    • System add-ons
    • Test Pilot
    • Data Separate from code
    • Downloadable content
    • Features delivered as web apps
  • System addons
    • Part of core FF, modularized into an add-on
    • Build/test against existing FF build, a lot smaller test
    • Updated up to daily (for now) on any release channel
    • signed and trusted
    • Restartless updates
      • install or update without a browser restart
      • Restarts suck
      • Restartless coming soon for system add-ons
    • Good for rapid iteration, particularly on the front-end
    • Wrappers for services
    • Replacing hotfixes
  • Problems with add-ons
    • Localisation
    • Optimizing UX : Better browser faster vs update fatigue
    • Upfront telemetry requirements
    • Dependency management on Firefox
    • Dependency management between system add-ons (coming soon)
  • Add-ons in flight
    • Firefox Hello is already an add-on
    • Currently in beta in 45
    • First beta updates before 46
  • Test Pilot
    • Release channel users opt in to new features
    • Release channel users different from pre-release ones
    • Developed as regular add-ons (not system add-ons)
    • Can graduate to system add-ons by flipping a bit
  • Data should be separate from code
    • Sec policy
    • blocklists
    • tracking protection list
    • dictionaries
    • fonts
  • Many times Data update == release , this is broken
  • Also some have their own updaters
  • Kinto
    • Lightweight JSON storage with sync, sharing, signing
    • Native JSON over HTTP
    • niceties of CouchDB backed by PostgreSQL
  • How Kinto Works
    • pings for updates
    • Balrog supplies link to Kinto
    • signed data downloaded, checked, applied
  • Kinto good for
    • Add-ons block list
  • Downloadable Content
    • Some parts of the browser may not be needed frequently
    • May not be needed on startup
    • eg language packs, fonts for Firefox on Android
  • Features delivered remotely
    • Browser features delivered as web apps
    • Pull in content from the server
    • in an early stage
  • Futures
    • Easy for projects to implement
    • Better “knobs and dials” (canaries, A/B, data viz)
    • Push-based updates
    • Simpler localisation
  • Questions
    • They support rollbacks
    • Worst case: Firefox has a startup crash
    • Not sure how Iceweasel would fit in.
    • How will it affect the ESR channel? – Won’t change, they will stay security-only
    • Bad add-ons – Hate ones that report user data, crashers (eg Skype toolbar at one point), and ones that hijack your browser and change settings
    • There is much collaboration between [open source] browsers
    • You are avoiding the release cycle, planning to speed it up – Lots of tests that can’t get rid of all, working on it but not a simple thing to solve.
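
The Kinto flow noted above (signed data downloaded, checked, applied) can be sketched roughly as follows. This is an illustration only: the notes do not describe the signature scheme, so an HMAC over the JSON payload stands in for it, and `SHARED_KEY`, `sign`, `apply_update` and the blocklist layout are hypothetical names.

```python
import hashlib
import hmac
import json

# Hypothetical shared key; a stand-in for whatever signing scheme is really used.
SHARED_KEY = b"example-key"


def sign(payload: bytes, key: bytes = SHARED_KEY) -> str:
    """Return a hex HMAC-SHA256 signature for the payload (stand-in scheme)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def apply_update(current: dict, payload: bytes, signature: str) -> dict:
    """Check the signature first and only then apply the downloaded data."""
    if not hmac.compare_digest(sign(payload), signature):
        # Reject tampered or corrupted data and keep the existing state.
        return current
    updated = dict(current)
    updated.update(json.loads(payload))
    return updated


# Made-up blocklist entries, purely for illustration.
blocklist = {"bad-addon@example.com": True}
payload = json.dumps({"worse-addon@example.com": True}).encode()
good = apply_update(blocklist, payload, sign(payload))
bad = apply_update(blocklist, payload, "0" * 64)
```

The point of the ordering is that data failing the check never reaches the apply step, so a bad download leaves the existing list untouched.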


Craige McWhirter: LCA2016 Wednesday Keynote - Catarina Mota

Wed, 2016-02-03 10:40

Catarina Mota spoke about how open source software changed her life.

  • Discussed the spilling over of online communities into the real world, with 3D printing as an example.
  • In particular covered the RepRap printer and community.
  • RepRap printer provided the ability to short circuit traditional manufacturing.

"think of RepRap as a China on your desktop" - Chris DiBona

  • Open Source (GNU GPL) is core to the RepRap mission and its success.
  • Catarina described her house as an Open Source house.
  • Followed Open Source principles to design and build her home.
  • The goal was more affordable and sustainable housing.
Why Does it Matter?
  • Machines are not neutral.
  • Talked about obsolescence and e-waste.
  • Phones are a big contributor.
  • Phonebloks is a phone designed to be a phone worth keeping, with entirely replaceable components.
  • Users are co-creators.
  • Meant to be repaired, transformed, adapted and appropriated.

Catarina gave a wonderful talk on open source software, hardware and the way it's changing the world. Great talk.

Lev Lafayette: Reviving a Downed Compute Node in TORQUE/MOAB

Wed, 2016-02-03 10:29

The following describes a procedure for bringing up a compute node in TORQUE that's marked as 'Down'. Whilst the procedure, once known, is relatively simple, reaching that point required some research, so this document may save others time.

1. Determine whether the node is really down.

Following an almighty NFS outage quite a number of compute nodes were marked as "down". However the two standard tools, `mdiag -n | grep "Down"` and `pbsnodes -ln` gave significantly different results.
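
One way to see exactly where the two tools disagree is to compare their node lists as sets. The sketch below assumes each tool's output has already been reduced to one node name per line (for example with `awk`); `parse_nodes`, `compare_down_lists` and the node names are hypothetical.

```python
def parse_nodes(text):
    """Collect node names, assuming one name per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def compare_down_lists(moab_text, torque_text):
    """Group nodes by which tool reports them as down."""
    moab, torque = parse_nodes(moab_text), parse_nodes(torque_text)
    return {
        "both": moab & torque,
        "moab_only": moab - torque,
        "torque_only": torque - moab,
    }


# Made-up node names, purely for illustration.
result = compare_down_lists("node01\nnode02\nnode03\n", "node02\nnode04\n")
```

Nodes that appear in only one of the two lists are the ones worth checking by hand first.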


Colin Charles: Donating to an opensource project when you download it

Tue, 2016-02-02 20:25

Apparently I’ve always thought that donating to opensource software that you use would be a good idea — I found this about Firefox add-ons. I suggested that the MariaDB Foundation do this for downloads to the MariaDB Server, and it looks like most people seem to think that it is an OK thing to do.

I see it being done with Ubuntu, LibreOffice, and more recently: elementary OS. The reasoning seems sound, though there was some controversy before they changed the language of the post. Though I’m not sure that I’d force the $0 figure. 

For something like MariaDB Server, this will probably mostly affect Microsoft Windows users; Linux users have repositories configured or use it from their distribution of choice!

Simon Lyall: 2016 – Sysadmin Miniconf – Session 3

Tue, 2016-02-02 17:28

The life of a Sysadmin in a research environment – Eric Burgueno

  • Everything must be reproducible
  • Keeping systems up as long as possible, rather than having an overall uptime percentage
  • One person needs to cover lots of roles rather than specialise
  • 2 Servers with 2TB of RAM. Others smaller according to need
  • Lots of varied tools, mostly bioinformatics software
  • 90TB to over 200TB of data over 2 years. Lots of large files. Big files, big servers.
  • Big job using 2TB of RAM taking 8 days to run.
  • The 2*2TB servers can be joined together to create a single 4TB server
  • Have to customize environment for each tool, hard when there are lots of tools and also want to compare/collaborate against other places where software is being run.
  • Reproducible(?) Research

Creating bespoke logging systems and dashboards with Grafana, in fifteen minutes – Andrew McDonnell

Live Demo

Order in the chaos: or lessons learnt on planning in operations – Peter Hall

  • Lead of an Ops team at REA group. Looks after dev teams for 10-15 applications
  • Ops is not a project, but works with many projects
  • Many sources of work, dev, security, incidents, infrastructure improvement
  • Understand the work
    • Document your work
    • Talk about it, 15min standup
  • Schedule things
    • and prepare for the unplanned
    • Perhaps 2 weeks
    • Leave lots of slack
  • Interruptions
    • Assign team members to each ops team
    • Rotating “ops goal keeper”
    • Developers on pager
  • Review Often
  • Longer term goals for your team
  • Failure demand vs value demand.
    • Make sure [at least some of] what you are doing is adding value to the environment


From Commit to Cloud – Daniel Hall

  • Deployments should be:
    • fast – 10 minutes
    • small – only one feature change, and the person doing it should be aware of everything that is changing
    • easy – as little human work as possible, simple to understand
  • We believe this because
    • less to break
    • devs should focus on dev
    • each project should be really easy to learn, devs can switch between projects easily
    • Don’t want anyone to be afraid to deploy
  • Able to rollback
    • 30 microservices
    • 2 devs plus some work from others
  • How to do it
    • Microservices arch (optional but helps)
    • git , build agent, packaging format with dependencies
    • something to run your stuff
  • code -> git -> build -> auto test -> package -> staging -> test -> deploy to prod
  • Application build is triggered by git
    • script in each repo
  • Auto test after build, don’t do end-to-end testing, do that in staging
  • Package app – they use docker – push to internal docker repo
  • Deploy to staging – they use curl to push JSON to Mesos/Marathon, which pulls the container. Testing is run there
  • Single Click approval to deploy to staging
  • Deploy to prod – should be same as how you deploy to staging.
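
The "push JSON to Marathon" step above might look roughly like this. The talk used curl; Python's `urllib` is swapped in here, and the request is only prepared, never sent. The app id, image name, host names and resource figures are hypothetical, and the field names follow Marathon's app definition format as commonly documented.

```python
import json
import urllib.request


def marathon_app(app_id, image, instances=2):
    """Build a Marathon-style app definition that runs a Docker image."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": 0.25,
        "mem": 256,
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }


def deploy_request(base_url, app):
    """Prepare (but do not send) the PUT that creates or updates the app."""
    return urllib.request.Request(
        url=f"{base_url}/v2/apps{app['id']}",
        data=json.dumps(app).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical registry, image tag and Marathon hosts.
app = marathon_app("/billing", "registry.internal.example/billing:abc123")
staging = deploy_request("http://marathon-staging.example:8080", app)
prod = deploy_request("http://marathon-prod.example:8080", app)
```

Building both requests from the same app definition reflects the last point in the notes: the prod deploy differs from the staging deploy only in where it is sent.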

LNAV – Paul Wayper

  • Point at a dir. read all the files. sort all the lines together in timestamp order
  • Colour codes, machines, different facilities(daemons). Highlights IP addresses
  • Error lines in red, warning lines in yellow
  • Regular expressions highlighted. Fully PCRE compatible
  • Able to move back and forth an hour or a day at a time with special keys
  • Histogram of error lines, number per minute etc
  • more complete (SQL like) queries
  • compiles as a static binary
  • Ability to add your own log file formats
  • Ability to share format filters with others
  • Doesn’t deal with journald logs
  • Available for EPEL, Fedora, Debian but under a lot of active development.
  • acts like tail -f to spot updates to logs.
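
The core idea in the first point (read several logs and interleave their lines in timestamp order) can be sketched as below. This is not how lnav itself is implemented; it assumes each file is already in time order and that every line starts with an ISO 8601 timestamp, and the function names and log lines are made up.

```python
import heapq
from datetime import datetime


def timestamp(line):
    """Parse the leading ISO 8601 timestamp (an assumed log format)."""
    return datetime.fromisoformat(line.split(" ", 1)[0])


def merge_logs(*logs):
    """Interleave several time-ordered logs into one time-ordered list."""
    return list(heapq.merge(*logs, key=timestamp))


# Made-up log lines from two hypothetical files.
auth_log = [
    "2016-02-02T10:00:01 sshd: accepted login",
    "2016-02-02T10:00:09 sshd: error: connection reset",
]
web_log = [
    "2016-02-02T10:00:05 httpd: GET /index.html",
    "2016-02-02T10:00:12 httpd: warning: slow response",
]
merged = merge_logs(auth_log, web_log)
```

`heapq.merge` only ever compares the next unread line of each input, so the files do not need to be sorted again as a whole.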


Russell Coker: Compatibility and a Linux Community Server

Tue, 2016-02-02 17:26

Compatibility/interoperability is a good thing. It’s generally good for systems on the Internet to be capable of communicating with as many systems as possible. Unfortunately it’s not always possible as new features sometimes break compatibility with older systems. Sometimes you have systems that are simply broken, for example all the systems with firewalls that block ICMP so that connections hang when the packet size gets too big. Sometimes to take advantage of new features you have to potentially trigger issues with broken systems.

I recently added support for IPv6 to the Linux Users of Victoria server. I think that adding IPv6 support is a good thing due to the lack of IPv4 addresses even though there are hardly any systems that are unable to access IPv4. One of the benefits of this for club members is that it’s a platform they can use for testing IPv6 connectivity with a friendly sysadmin to help them diagnose problems. I recently notified a member by email that the callback that their mail server used as an anti-spam measure didn’t work with IPv6 and was causing mail to be incorrectly rejected. It’s obviously a benefit for that user to have the problem with a small local server rather than with something like Gmail.

In spite of the fact that at least one user had problems and others potentially had problems I think it’s clear that adding IPv6 support was the correct thing to do.

SSL Issues

Ben wrote a good post about SSL security [1] which links to a test suite for SSL servers [2]. I tested the LUV web site and got A-.

This blog post describes how to set up PFS (Perfect Forward Secrecy) [3]; after following its advice I got a score of B!

From the comments on this blog post about RC4 etc [4] it seems that the only way to have PFS and not be vulnerable to other issues is to require TLS 1.2.
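
As an illustration of what "require TLS 1.2" means in practice, here is a minimal sketch using Python's standard `ssl` module rather than the web server configuration the LUV server would actually use. The function name is hypothetical, and a real server would also need to load its certificate and key.

```python
import ssl


def tls12_only_context():
    """Create a server-side context that refuses anything older than TLS 1.2."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = tls12_only_context()
```

Clients that can only speak older protocol versions fail the handshake against such a context.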

So the issue is what systems can’t use TLS 1.2.

TLS 1.2 Support in Browsers

This Wikipedia page has information on SSL support in various web browsers [5]. If we require TLS 1.2 we break support of the following browsers:

The default Android browser before Android 5.0. Admittedly that browser always sucked badly and probably has lots of other security issues and there are alternate browsers. One problem is that many people who install better browsers on Android devices (such as Chrome) will still have their OS configured to use the default browser for URLs opened by other programs (EG email and IM).

Chrome versions before 30 didn’t support it. But version 30 was released in 2013 and Google does a good job of forcing upgrades. A Debian/Wheezy system I run is now displaying warnings from the google-chrome package saying that Wheezy is too old and won’t be supported for long!

Firefox before version 27 didn’t support it (the Wikipedia page is unclear about versions 27-31). Version 27 was released in 2014. Debian/Wheezy has version 38, Debian/Squeeze has Iceweasel 3.5.16 which doesn’t support it. I think it is reasonable to assume that anyone who’s still using Squeeze is using it for a server given its age and the fact that LTS is based on packages related to being a server.

IE version 11 supports it and runs on Windows 7+ (all supported versions of Windows). IE 10 doesn’t support it and runs on Windows 7 and Windows 8. Are the free upgrades from Windows 7 to Windows 10 going to solve this problem? Do we want to support Windows 7 systems that haven’t been upgraded to the latest IE? Do we want to support versions of Windows that MS doesn’t support?

Windows mobile doesn’t have enough users to care about.

Opera supports it from version 17. This is noteworthy because Opera used to be good for devices running older versions of Android that aren’t supported by Chrome.

Safari supported it from iOS version 5, I think that’s a solved problem given the way Apple makes it easy for users to upgrade and strongly encourages them to do so.

Log Analysis

For many servers the correct thing to do before even discussing the issue is to look at the logs and see how many people use the various browsers. One problem with that approach on a Linux community site is that the people who visit the site most often will be more likely to use recent Linux browsers but older Windows systems will be more common among people visiting the site for the first time. Another issue is that there isn’t an easy way of determining who is a serious user, unlike for example a shopping site where one could search for log entries about sales.

I did a quick search of the Apache logs and found many entries about browsers that purport to be IE6 and other versions of IE before 11. But most of those log entries were from other countries. While some people from other countries visit the club web site, it’s not very common. Most access from outside Australia would be from bots, and the bots probably fake their user agent.
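
A quick search of this kind could be sketched as follows. It relies on the fact that IE versions up to 10 put an `MSIE <version>` token in their user agent string; the function name and the sample log entries (in Apache's combined format, with documentation-range addresses) are made up, and it makes no attempt to separate bots from real visitors.

```python
import re
from collections import Counter

# Internet Explorer up to version 10 identifies itself with an "MSIE <version>" token.
MSIE = re.compile(r"MSIE (\d+)\.")


def old_ie_counts(log_lines):
    """Count requests per claimed IE major version below 11."""
    counts = Counter()
    for line in log_lines:
        match = MSIE.search(line)
        if match and int(match.group(1)) < 11:
            counts[int(match.group(1))] += 1
    return counts


# Made-up entries in Apache's combined log format.
sample = [
    '203.0.113.5 - - [02/Feb/2016:10:00:00 +1100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '198.51.100.7 - - [02/Feb/2016:10:00:03 +1100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"',
    '192.0.2.9 - - [02/Feb/2016:10:00:07 +1100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"',
]
counts = old_ie_counts(sample)
```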

Should We Do It?

Is breaking support for Debian/Squeeze, the built in Android browser on Android <5.0, and Windows 7 and 8 systems that haven’t upgraded IE as a web browsing platform a reasonable trade-off for implementing the best SSL security features?

For the LUV server as a stand-alone issue the answer would be no as the only really secret data there is accessed via ssh. For a general web infrastructure issue it seems that the answer might be yes.

I think that it benefits the community to allow members to test against server configurations that will become more popular in the future. After implementing changes in the server I can advise club members (and general community members) about how to configure their servers for similar results.

Does this outweigh the problems caused by some potential users of ancient systems?

I’m blogging about this because I think that the issues of configuration of community servers have a greater scope than my local LUG. I welcome comments about these issues, as well as about the SSL compatibility issues.

Related posts:

  1. Name Server IP and a Dead Server About 24 hours ago I rebooted the system that runs...
  2. Server Costs vs Virtual Server Costs The Claim I have seen it claimed that renting a...
  3. My Blog Server was Cracked On the 1st of August I noticed that the server...

Craige McWhirter: Haskell is Not For Production and Other Tales - Katie Miller - LCA2016

Tue, 2016-02-02 17:17

Katie Miller gave an excellent talk about Haskell covering:

  • Haskell is at the heart of Sigma at Facebook handling more than 1 million requests / second
  • Haxl is a Haskell framework in Sigma and is used for fighting malicious activity on Facebook
  • Haxl is open source.
  • Haskell's origins are academic.
  • Haskell is used widely in industry. Katie listed quite a few.
Why Haskell?
  • The ability to reason about code, from:
    • Purity
    • Strong static typing
    • Abstract away from concurrency
  • Haskell performs:
    • As much as 3x as fast as its predecessor
    • 30 times as fast for the user experience.
The Myth of Haskell Being Difficult
  • Not necessarily difficult, but it is different and that results in friction.
  • Expectations:
    • Concepts don't map neatly to familiar languages.
    • It's a journey teaching you a new way to think.
  • Abstractions:
    • Abstract concepts can take time to get a handle on.
  • Type Errors
    • Can be a little difficult to explain to new users
    • Worth mastering due to increase of productivity.
Teaching Haskell
  • Didn't mention monads.
  • Stuck to notations and implicit concurrency.
  • Created conversation space to discuss issues.
  • Results exceeded expectations.
Hiring Difficulty
  • There are more people wanting to work in Haskell than there are Haskell jobs.
  • An embarrassment of talent riches.
Haskell Community Difficulty
  • Community may have forgotten how to communicate to new people.
  • Keep in mind what it was like to be new.
  • Documentation for libraries needs to improve.
  • Work to create a diverse and welcoming community.
  • Be technically brutal but personally respectful.
  • Haskell is not immune to bad code.

"Using functional programming is a conclusion from a goal, not the goal itself" - Brian McKenna

  • The legacy of Haskell is spreading good ideas and this was an original goal.
  • Haskell's difference brings both benefits and challenges.

"Open Source = opportunity" - Katie Miller

Conclusion: Haskell is for production.

Simon Lyall: 2016 – Sysadmin Miniconf – Session 2

Tue, 2016-02-02 16:28

Site Reliability Engineering at Dropbox – Tammy Butow

  • Having an SLA, measuring against it. Caps ops work, Blameless Post Mortem, Always be coding
  • 400M customers, a billion files every day
  • Very hard to find people to scale, so build tools to scale instead
  • Team looks at 6,000 DB machines, look after whole machines not just the app
  • Build a lot of tools in Python and Go
  • PygerDuty – Python library for the PagerDuty API
    • Easy to find the top things paging, write tools to reduce these
    • Regular weekly meeting to review those problems and make them better
    • If work is happening on machines then turn off monitoring on them so people don’t get woken up for things they don’t need to.
    • Going for days without any pages
  • Self-Healing and auto-remediation scripts
  • Hermes
    • Allocate and track tasks to groups
  • Automation of DB tasks
  • Bot copies PagerDuty alerts into Slack
  • Aim Higher
    • Create a roadmap for next 12 months
    • Building a rocketship while it is flying through the sky
  • Follow the Sun so people are working days
  • Post Mortem for every page
  • Frequent DR testing
  • Take time out to celebrate
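
Finding "the top things paging", as mentioned above, comes down to counting alerts by name. The sketch below does not use PygerDuty or the PagerDuty API; it works on a made-up list of alert records, and `top_pagers` is a hypothetical helper.

```python
from collections import Counter


def top_pagers(alerts, n=3):
    """Return the n most frequent alert names with their page counts."""
    return Counter(alert["name"] for alert in alerts).most_common(n)


# Made-up alert records; real data would come from the paging system's API.
alerts = [
    {"name": "db-replication-lag", "host": "db101"},
    {"name": "disk-full", "host": "db207"},
    {"name": "db-replication-lag", "host": "db330"},
    {"name": "db-replication-lag", "host": "db101"},
    {"name": "disk-full", "host": "db412"},
    {"name": "host-unreachable", "host": "db090"},
]
top = top_pagers(alerts, n=2)
```

A ranked list like this gives the weekly review meeting an obvious starting point for which alerts to automate away first.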

I missed out writing up the next couple of talks due to technical problems



Tim Serong: A Gentle Introduction to Ceph

Tue, 2016-02-02 15:28

I told a little story about Ceph today in the sysadmin miniconf at linux.conf.au 2016. Simon and Ewen have already put a copy of my slides up on the miniconf web site, but for bonus posterity you can get them from here also. For some more detail, check out Lars’ SUSE Enterprise Storage – Design and Performance for Ceph slides from SUSECon 2015 and Software-defined Storage: Sizing and Performance.

Update (2016-02-07): The video of the talk can be found here.