Advice for Success as a Technology Geek
http://menno.io | firstname.lastname@example.org | @mjs0
## Questions & Comments?! Feel free to interrupt me
# I'm not a Data Scientist! (but I'll try to make this relevant)
## Currently: Cacophony Project * Technology to deal with NZ's invasive predator problem - possums, mustelids, rats, etc * Embedded systems, computer vision, machine learning, web apps & control systems * Open source and non-profit * We're hiring!
# Advice for Success * Some things that I've learned * Hopefully some of these are new to you Note: * I had many more ideas * Tried to pick the more important, interesting & unique ones
# Always Be Learning Note: * most important advice I'm going to give today * The best technology people I've worked with are the ones who are always pushing to understand things better.
## Always Be Learning - Why? * More perspectives ➡ more effective * Problem solving power * It's fun! * Don't have to know everything about a topic - knowing that something exists is often enough - can always review later
## Ways to learn * Stay current with the platforms you use * Podcasts, blogs, Twitter, professional organisations - ACM is great * Lean on experienced people around * User groups
## More ways to learn * Expose yourself to new challenges - say "yes" more often - that hard project no one wants to touch * Be curious * Look beneath the level of abstraction you're at Note: For Data Science, this might mean digging into the algorithms, databases, tools & data sources you use.
## Learning is hard * Enjoy the learning process * Lean into difficulty * It's about perspective Note: It's easy for me to say all this but...
## "I don't know" is ok * Nobody can know everything * Don't be afraid to say "I don't understand" * "Can you explain?" is a powerful question Note: * Even the most experienced people can't know or understand everything. * Use these moments as opportunities to improve understanding * In discussions & meetings, you're probably not the only one who doesn't understand
# Open Source
## Contributing to Open Source * Learning beyond your course/job - different approaches & technologies - more people to learn from * Rewarding & good for society * Expands career options
## Open Source & Employers * Employers love candidates with open source experience * Experience even when you're junior * Demonstrates interest beyond the 9 to 5 * Real world work samples * Shows how you work with others * Many companies rely on open source Note: * When hiring, candidates with a solid open source record usually get onto the shortlist. * Some of my roles have come directly as a result of open source contributions.
## Getting Started * Find a project that interests you * Based on technology that you're familiar with * Start small - scratch an itch * Be polite
# The Value of Other Roles
## You will be working with other people! (people with other job titles)
## For example... * Sales * Marketing * Business development * System administration * Finance * Management
## Sometimes these people ... * ... have annoying or strange requests * ... appear to be insane
## Be patient * There's usually a good reason * Listen * Ask questions * Find the underlying need * Highlight what's possible
## Working together * They need you & you need them * Success depends on cooperation * Small things can have big impact - example: automating marketing reports - example: streamlining invoice generation * Automate where possible
## Fun fact You will be working more hours than you're paid for Note: * It's the reality of the technology industry * Estimation is extremely difficult * Unpredictable challenges
## Working long hours * Long hours are sometimes unavoidable * But... - Not sustainable - Not effective * Lots of studies show this Note: * Occasional long days are OK * Slippery slope
## Impact - More mistakes - Poor judgement - Loss of drive & creativity - Impact on family & social life - Burn out & mental health issues - Higher staff turnover
## Take care of yourself * Physically - walk, run, yoga, gym... * Mentally - sleep is important - do other things - social interaction
## Employers * Some "get it", some don't * Be prepared to walk away * Often rooted in company culture - hard to change
## Work/life balance examples Note: * Canonical * BATS * Resolver
# Data Pipelines / Streaming Note: Shift gears
## Streaming * Process your data as a pipeline * Data processed in small units - records / lines / chunks * Useful pattern at small & large scales ![pipeline](pipeline.png "Pipeline")
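Note: A minimal sketch of the idea, not taken from the talk: each stage is a generator that handles one record at a time, so only a single line is in memory at any moment. The names `read_records` and `parse`, and the `name,value` line format, are assumptions made for illustration.

```python
import io


def read_records(stream):
    """Yield one stripped, non-empty line at a time from a file-like object."""
    for line in stream:
        line = line.strip()
        if line:
            yield line


def parse(lines):
    """Turn 'name,value' lines into (name, int) tuples, one at a time."""
    for line in lines:
        name, value = line.split(",")
        yield name, int(value)


# io.StringIO stands in for a real file or network stream.
source = io.StringIO("possum,3\nrat,5\n\nstoat,2\n")

# Nothing is read until sum() starts pulling records through the pipeline.
total = sum(value for _, value in parse(read_records(source)))
print(total)
```

The same two stages would work unchanged on a multi-gigabyte file opened with `open()`, because neither stage ever builds a list.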
## Why? * Reduced memory requirements * Easier to understand, maintain & test * Often faster * Distribution * Concurrency
## Typical Pipeline Operations * Modify fields * Add new fields ("annotate") * Remove unneeded fields * Filter
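Note: One way the four operations above might look as small generator functions over dict records. The field names (`species`, `device`, `debug`) and the `PREDATORS` set are hypothetical, chosen only to make the example concrete.

```python
PREDATORS = {"possum", "rat", "stoat"}  # hypothetical lookup set


def modify(records):
    """Modify a field: normalise the species name to lower case."""
    for r in records:
        yield {**r, "species": r["species"].lower()}


def annotate(records):
    """Add a new field derived from existing ones."""
    for r in records:
        yield {**r, "is_predator": r["species"] in PREDATORS}


def drop_fields(records, names):
    """Remove fields that later stages do not need."""
    for r in records:
        yield {k: v for k, v in r.items() if k not in names}


def keep_predators(records):
    """Filter: pass through only records flagged as predators."""
    return (r for r in records if r["is_predator"])


records = [
    {"species": "Possum", "device": "cam-1", "debug": "x"},
    {"species": "Kiwi", "device": "cam-2", "debug": "y"},
    {"species": "RAT", "device": "cam-3", "debug": "z"},
]

pipeline = drop_fields(keep_predators(annotate(modify(records))), {"debug"})
result = list(pipeline)
print(result)
```

Each function can be unit tested on its own with a short list of dicts, which is a large part of why this style is easy to maintain.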
## Advanced Operations * Split * Concatenate/chain * Interleave
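Note: A rough sketch of split, chain and interleave using only `itertools`. The helper names `split` and `interleave` are made up for this example, and round-robin is just one possible interleaving policy.

```python
import itertools


def split(records, predicate):
    """Split one stream into (matching, non_matching) streams.

    itertools.tee() buffers items that one branch has consumed and the
    other has not, so memory use grows if the branches are read unevenly.
    """
    a, b = itertools.tee(records)
    return filter(predicate, a), itertools.filterfalse(predicate, b)


def interleave(*streams):
    """Round-robin: take one item from each stream in turn until all are exhausted."""
    sentinel = object()
    for group in itertools.zip_longest(*streams, fillvalue=sentinel):
        for item in group:
            if item is not sentinel:
                yield item


evens, odds = split(range(6), lambda n: n % 2 == 0)

# Concatenate/chain: all of the first stream, then all of the second.
chained = list(itertools.chain(evens, odds))

mixed = list(interleave("ab", "xyz"))
print(chained, mixed)
```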
## Pipeline Example ![pipeline example](pipeline-example.png "Pipeline Example")
## In Python * Iterators and generators * Small functions for each transformation * `itertools` and `map()` are your friends * Look for iterator-based data source and sink APIs * `heapq.merge()` for merging large datasets Note: Anecdote about interleaving large sorted datasets from many machines at BATS using heapq.merge.
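Note: A small illustration of `heapq.merge()` combining several already-sorted streams lazily. The `readings` generator and the machine names are hypothetical stand-ins for per-machine data sources; this is not the BATS code.

```python
import heapq
import itertools


def readings(machine, timestamps):
    """Hypothetical per-machine source: yields (timestamp, machine) in sorted order."""
    for ts in timestamps:
        yield ts, machine


a = readings("a", [1, 4, 9])
b = readings("b", [2, 3, 10])
c = readings("c", [5, 6])

# heapq.merge() requires each input to be sorted already. It returns an
# iterator and only holds one pending item per input stream.
merged = heapq.merge(a, b, c)

# islice() takes the first five results without consuming the rest.
first_five = list(itertools.islice(merged, 5))

# map() applies a small transformation to each record.
labels = list(map(lambda r: f"{r[1]}@{r[0]}", first_five))
print(labels)
```

Because the inputs are generators, the merged stream can be far larger than memory as long as each source is sorted on the same key.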
## Streaming at Scale * Apache Spark * Kafka * Distributed system architectures * Event sourcing
## That's all Questions?