Planet Linux Australia

Syndicate content
Planet Linux Australia - http://planet.linux.org.au
Updated: 29 min 33 sec ago

Simon Lyall: DevOps Days Auckland 2017 – Wednesday Session 1

Wed, 2017-10-04 09:02

Michael Coté – Not actually a DevOps Talk

Digital Transformation

  • Goal: deliver value, weekly reliably, with small patches
  • Management must be the first to fail and transform
  • Standardize on a platform: special snow flakes are slow, expensive and error prone (see his slide, good list of stuff that should be standardize)
  • Ramping up: “Pilot low-risk apps, and ramp-up”
  • Pair programming/working
    • Half the advantage is people speed less time on reddit “research”
  • Don’t go to meetings
  • Automate compliance, have what you do automatic get logged and create compliance docs rather than building manually.
  • Crafting Your Cloud-Native Strategy

Sajeewa Dayaratne – DevOps in an Embedded World

  • Challenges on Embedded
    • Hardware – resource constrinaed
    • Debugging – OS bugs, Hardware Bugs, UFO Bugs – Oscilloscopes and JTAG connectors are your friend.
    • Environment – Thermal, Moisture, Power consumption
    • Deploy to product – Multi-month cycle, hard of impossible to send updates to ships at sea.
  • Principles of Devops , equally apply to embedded
    • High Frequency
    • Reduce overheads
    • Improve defect resolution
    • Automate
    • Reduce response times
  • Navico
    • Small Sonar, Navigation for medium boats, Displays for sail (eg Americas cup). Navigation displays for large ships
    • Dev around world, factory in Mexico
  • Codebase
    • 5 million lines of code
    • 61 Hardware Products supported – Increasing steadily, very long lifetimes for hardware
    • Complex network of products – lots of products on boat all connected, different versions of software and hardware on the same boat
  • Architecture
    • Old codebase
    • Backward compatible with old hardware
    • Needs to support new hardware
    • Desire new features on all products
  • What does this mean
    • Defects were found too late
    • Very high cost of bugs found late
    • Software stabilization taking longer
    • Manual test couldn’t keep up
    • Cost increasing , including opportunity cost
  • Does CI/CD provide answer?
    • But will it work here?
    • Case Study from HP. Large-Scale Agile Development by Gary Gruver
  • Our Plan
    • Improve tolls and archetecture
    • Build Speeds
    • Automated testing
    • Code quality control
  • Previous VCS
    • Proprietary tool with limit support and upgrades
    • Limited integration
    • Lack of CI support
    • No code review capacity
  • Move to git
    • Code reviews
    • Integrated CI
    • Supported by tools
  • Archetecture
    • Had a configurable codebase already
    • Fairly common hardware platform (only 9 variations)
    • Had runtime feature flags
    • But
      • Cyclic dependancies – 1.5 years to clean these up
      • Singletons – cut down
      • Promote unit testability – worked on
      • Many branches – long lived – mega merges
  • Went to a single Branch model, feature flags, smaller batch sizes, testing focused on single branch
  • Improve build speed
    • Start 8 hours to build Linux platform, 2 hours for each app, 14+ hours to build and package a release
    • Options
      • Increase speed
      • Parallel Builds
    • What did
      • ccache.clcache
      • IncrediBuild
      • distcc
    • 4-5hs down to 1h
  • Test automation
    • Existing was mock-ups of the hardware to not typical
    • Started with micro-test
      • Unit testing (simulator)
      • Unit testing (real hardware)
    • Build Tools
      • Software tools (n2k simulator, remote control)
      • Hardware tools ( Mimic real-world data, re purpose existing stuff)
    • UI Test Automation
      • Build or Buy
      • Functional testing vs API testing
      • HW Test tools
      • Took 6 hours to do full test on hardware.
  • PipeLine
    • Commit -> pull request
    • Automated Build / Unit Tests
    • Daily QA Build
  • Next?
    • Configuration as code
    • Code Quality tools
    • Simulate more hardware
    • Increase analytics and reporting
    • Fully simulated test env for dev (so the devs don’t need the hardware)
    • Scale – From internal infrastructure to the cloud
    • Grow the team
  • Lessons Learnt
    • Culture!
    • Collect Data
    • Get Executive Buy in
    • Change your tolls and processes if needed
    • Test automation is the key
      • Invest in HW
      • Silulate
      • Virtualise
    • Focus on good software design for Everything

Ben Martin: Ikea wireless charger in CNC mahogany case

Tue, 2017-10-03 21:41
I notice that Ikea sell their wireless chargers without a shell for insertion into desks. The "desk" I chose is a curve cut profile in mahogany that just happens to have the same fit as an LG G3/4/5 type phone. The design changed along the way to a more upright one which then required a catch to stop the phone sliding off.


This was done in Fusion360 which allows bringing in STL files of things like phones and cutting those out of another body. It took a while to work out the ball end toolpath but I finally worked out how to get something that worked reasonably well. The chomps in the side allow fingers to securely lift the phone off the charger.

It will be interesting to play with sliced objects in wood. Layering 3D cuts to build up objects that are 10cm (or about 4 layers) tall.

Simon Lyall: DevOps Days Auckland 2017 – Tuesday Session 3

Tue, 2017-10-03 15:02

Mirror, mirror, on the wall: testing Conway’s Law in open source communities – Lindsay Holmwood

  • The map between the technical organisation and the technical structure.
  • Easy to find who owns something, don’t have to keep two maps in your head
  • Needs flexibility of the organisation structure in order to support flexibility in a technical design
  • Conway’s “Law” really just adage
  • Complexity frequently takes the form of hierarchy
  • Organisations that mirror perform badly in rapidly changing and innovative enviroments

Metrics that Matter – Alison Polton-Simon (Thoughtworks)

  • Metrics Mania – Lots of focus on it everywhere ( fitbits, google analytics, etc)
  • How to help teams improve CD process
  • Define CD
    • Software consistently in a deployable state
    • Get fast, automated feedback
    • Do push-button deployments
  • Identifying metrics that mattered
    • Talked to people
    • Contextual observation
    • Rapid prototyping
    • Pilot offering
  • 4 big metrics
    • Deploy ready builds
    • Cycle time
    • Mean time between failures
    • Mean time to recover
  • Number of Deploy-ready builds
    • How many builds are ready for production?
    • Routine commits
    • Testing you can trust
    • Product + Development collaboration
  • Cycle Time
    • Time it takes to go from a commit to a deploy
    • Efficient testing (test subset first, faster testing)
    • Appropriate parallelization (lots of build agents)
    • Optimise build resources
  • Case Study
    • Monolithic Codebase
    • Hand-rolled build system
    • Unreliable environments ( tests and builds fail at random )
    • Validating a Pull Request can take 8 hours
    • Coupled code: isolated teams
    • Wide range of maturity in testing (some no test, some 95% coverage)
    • No understanding of the build system
    • Releases routinely delay (10 months!) or done “under the radar”
  • Focus in case study
    • Reducing cycle time, increasing reliability
    • Extracted services from monolith
    • Pipelines configured as code
    • Build infrastructure provisioned as docker and ansible
    • Results:
      • Cycle time for one team 4-5h -> 1:23
      • Deploy ready builds 1 per 3-8 weeks -> weekly
  • Mean time between failures
    • Quick feedback early on
    • Robust validation
    • Strong local builds
    • Should not be done by reducing number of releases
  • Mean time to recover
    • How long back to green?
    • Monitoring of production
    • Automated rollback process
    • Informative logging
  • Case Study 2
    • 1.27 million lines of code
    • High cyclomatic complexity
    • Tightly coupled
    • Long-running but frequently failing testing
    • Isolated teams
    • Pipeline run duration 10h -> 15m
    • MTTR Never -> 50 hours
    • Cycle time 18d -> 10d
    • Created a dashboard for the metrics
  • Meaningless Metrics
    • The company will build whatever the CEO decides to measure
    • Lines of code produced
    • Number of Bugs resolved. – real life duplicates Dilbert
    • Developers Hours / Story Points
    • Problems
      • Lack of team buy-in
      • Easy to agme
      • Unintended consiquences
      • Measuring inputs, not impacts
  • Make your own metrics
    • Map your path to production
    • Highlights pain points
    • collaborate
    • Experiment

 

Simon Lyall: DevOps Days Auckland 2017 – Tuesday Session 2

Tue, 2017-10-03 11:02

Using Bots to Scale incident Management – Anthony Angell (Xero)

  • Who we are
    • Single Team
    • Just a platform Operations team
  • SRE team is formed
    • Ops teams plus performance Engineering team
  • Incident Management
    • In Bad old days – 600 people on a single chat channel
    • Created Framework
    • what do incidents look like, post mortems, best practices,
    • How to make incident management easy for others?
  • ChatOps (Based on Hubot)
    • Automated tour guide
    • Multiple integrations – anything with Rest API
    • Reducing time to restore
    • Flexability
  • Release register – API hook to when changes are made
  • Issue report form
    • Summary
    • URL
    • User-ids
    • how many users & location
    • when started
    • anyone working on it already
    • Anything else to add.
  • Chat Bot for incident
    • Populates for an pushes to production channel, creates pagerduty alert
    • Creates new slack channel for incident
    • Can automatically update status page from chat and page senior managers
    • Can Create “status updates” which record things (eg “restarted server”), or “yammer updates” which get pushed to social media team
    • Creates a task list automaticly for the incident
    • Page people from within chat
    • At the end: Gives time incident lasted, archives channel
    • Post Mortum
  • More integrations
    • Report card
    • Change tracking
    • Incident / Alert portal
  • High Availability – dockerisation
  • Caching
    • Pageduty
    • AWS
    • Datadog

 

Simon Lyall: DevOps Days Auckland 2017 – Tuesday Session 1

Tue, 2017-10-03 09:02

DevSecOps – Anthony Rees

“When Anthrax and Public Enemy came together, It was like Developers and Operations coming together”

  • Everybody is trying to get things out fast, sometimes we forget about security
  • Structural efficiency and optimised flow
  • Compliance putting roadblock in flow of pipeline
    • Even worse scanning in production after deployment
  • Compliance guys using Excel, Security using Shell-scripts, Develops and Operations using Code
  • Chef security compliance language – InSpec
    • Insert Sales stuff here
  • ispec.io
  • Lots of pre-written configs available

Immutable SQL Server Clusters – John Bowker (from Xero)

  • Problem
    • Pet Based infrastructure
    • Not in cloud, weeks to deploy new server
    • Hard to update base infrastructure code
  • 110 Prod Servers (2 regions).
  • 1.9PB of Disk
  • Octopus Deploy: SQL Schemas, Also server configs
  • Half of team in NZ, Half in Denver
    • Data Engineers, Infrastructure Engineers, Team Lead, Product Owner
  • Where we were – The Burning Platform
    • Changed mid-Migration from dedicated instances to dedicated Hosts in AWS
    • Big saving on software licensing
  • Advantages
    • Already had Clustered HA
    • Existing automation
    • 6 day team, 15 hours/day due to multiple locations of team
  • Migration had to have no downtime
    • Went with node swaps in cluster
  • Split team. Half doing migration, half creating code/system for the node swaps
  • We learnt
    • Dedicated hosts are cheap
    • Dedicated host automation not so good for Windows
    • Discovery service not so good.
    • Syncing data took up to 24h due to large dataset
    • Powershell debugging is hard (moving away from powershell a bit, but powershell has lots of SQL server stuff built in)
    • AWS services can timeout, allow for this.
  • Things we Built
    • Lots Step Templates in Octopus Deploy
    • Metadata Store for SQL servers – Dynamite (Python, Labda, Flask, DynamoDB) – Hope to Open source
    • Lots of PowerShell Modules
  • Node Swaps going forward
    • Working towards making this completely automated
    • New AMI -> Node swap onto that
    • Avoid upgrade in place or running on old version

James Morris: Linux Security Summit 2017 Roundup

Mon, 2017-10-02 13:01

The 2017 Linux Security Summit (LSS) was held last month in Los Angeles over the 14th and 15th of September.  It was co-located with Open Source Summit North America (OSSNA) and the Linux Plumbers Conference (LPC).

LSS 2017

Once again we were fortunate to have general logistics managed by the Linux Foundation, allowing the program committee to focus on organizing technical content.  We had a record number of submissions this year and accepted approximately one third of them.  Attendance was very strong, with ~160 attendees — another record for the event.

LSS 2017 Attendees

On the day prior to LSS, attendees were able to access a day of LPC, which featured two tracks with a security focus:

Many thanks to the LPC organizers for arranging the schedule this way and allowing LSS folk to attend the day!

Realtime notes were made of these microconfs via etherpad:

I was particularly interested in the topic of better integrating LSM with containers, as there is an increasingly common requirement for nesting of security policies, where each container may run its own apparently independent security policy, and also a potentially independent security model.  I proposed the approach of introducing a security namespace, where all security interfaces within the kernel are namespaced, including LSM.  It would potentially solve the container use-cases, and also the full LSM stacking case championed by Casey Schaufler (which would allow entirely arbitrary stacking of security modules).

This would be a very challenging project, to say the least, and one which is further complicated by containers not being a first class citizen of the kernel.   This leads to security policy boundaries clashing with semantic functional boundaries e.g. what does it mean from a security policy POV when you have namespaced filesystems but not networking?

Discussion turned to the idea that it is up to the vendor/user to configure containers in a way which makes sense for them, and similarly, they would also need to ensure that they configure security policy in a manner appropriate to that configuration.  I would say this means that semantic responsibility is pushed to the user with the kernel largely remaining a set of composable mechanisms, in relation to containers and security policy.  This provides a great deal of flexibility, but requires those building systems to take a great deal of care in their design.

There are still many issues to resolve, both upstream and at the distro/user level, and I expect this to be an active area of Linux security development for some time.  There were some excellent followup discussions in this area, including an approach which constrains the problem space. (Stay tuned)!

A highlight of the TPMs session was an update on the TPM 2.0 software stack, by Philip Tricca and Jarkko Sakkinen.  The slides may be downloaded here.  We should see a vastly improved experience over TPM 1.x with v2.0 hardware capabilities, and the new software stack.  I suppose the next challenge will be TPMs in the post-quantum era?

There were further technical discussions on TPMs and container security during subsequent days at LSS.  Bringing the two conference groups together here made for a very productive event overall.

TPMs microconf at LPC with Philip Tricca presenting on the 2.0 software stack.

This year, due to the overlap with LPC, we unfortunately did not have any LWN coverage.  There are, however, excellent writeups available from attendees:

There were many awesome talks.

The CII Best Practices Badge presentation by David Wheeler was an unexpected highlight for me.  CII refers to the Linux Foundation’s Core Infrastructure Initiative , a preemptive security effort for Open Source.  The Best Practices Badge Program is a secure development maturity model designed to allow open source projects to improve their security in an evolving and measurable manner.  There’s been very impressive engagement with the project from across open source, and I believe this is a critically important effort for security.

CII Bade Project adoption (from David Wheeler’s slides).

During Dan Cashman’s talk on SELinux policy modularization in Android O,  an interesting data point came up:

Interesting data from the talk: 44% of Android kernel vulns blocked by SELinux due to attack surface reduction. https://t.co/FnU544B3XP

— James Morris (@xjamesmorris) September 15, 2017

We of course expect to see application vulnerability mitigations arising from Mandatory Access Control (MAC) policies (SELinux, Smack, and AppArmor), but if you look closely this refers to kernel vulnerabilities.   So what is happening here?  It turns out that a side effect of MAC policies, particularly those implemented in tightly-defined environments such as Android, is a reduction in kernel attack surface.  It is generally more difficult to reach such kernel vulnerabilities when you have MAC security policies.  This is a side-effect of MAC, not a primary design goal, but nevertheless appears to be very effective in practice!

Another highlight for me was the update on the Kernel Self Protection Project lead by Kees, which is now approaching its 2nd anniversary, and continues the important work of hardening the mainline Linux kernel itself against attack.  I would like to also acknowledge the essential and original research performed in this area by grsecurity/PaX, from which this mainline work draws.

From a new development point of view, I’m thrilled to see the progress being made by Mickaël Salaün, on Landlock LSM, which provides unprivileged sandboxing via seccomp and LSM.  This is a novel approach which will allow applications to define and propagate their own sandbox policies.  Similar concepts are available in other OSs such as OSX (seatbelt) and BSD (pledge).  The great thing about Landlock is its consolidation of two existing Linux kernel security interfaces: LSM and Seccomp.  This ensures re-use of existing mechanisms, and aids usability by utilizing already familiar concepts for Linux users.

Mickaël Salaün from ANSSI talking about his Landlock LSM work at #linuxsecuritysummit 2017 pic.twitter.com/wYpbHuLgm2

— LinuxSecuritySummit (@LinuxSecSummit) September 14, 2017

Overall I found it to be an incredibly productive event, with many new and interesting ideas arising and lots of great collaboration in the hallway, lunch, and dinner tracks.

Slides from LSS may be found linked to the schedule abstracts.

We did not have a video sponsor for the event this year, and we’ll work on that again for next year’s summit.  We have discussed holding LSS again next year in conjunction with OSSNA, which is expected to be in Vancouver in August.

We are also investigating a European LSS in addition to the main summit for 2018 and beyond, as a way to help engage more widely with Linux security folk.  Stay tuned for official announcements on these!

Thanks once again to the awesome event staff at LF, especially Jillian Hall, who ensured everything ran smoothly.  Thanks also to the program committee who review, discuss, and vote on every proposal, ensuring that we have the best content for the event, and who work on technical planning for many months prior to the event.  And of course thanks to the presenters and attendees, without whom there would literally and figuratively be no event :)

See you in 2018!

 

OpenSTEM: Stone Axes and Aboriginal Stories from Victoria

Mon, 2017-10-02 11:04
In the most recent edition of Australian Archaeology, the journal of the Australian Archaeological Association, there is a paper examining the exchange of stone axes in Victoria and correlating these patterns of exchange with Aboriginal stories in the 19th century. This paper is particularly timely with the passing of legislation in the Victorian Parliament on […]