Tuesday

Went to the Elastic{ON} morning conference thing at the Hilton today. The first part of the conference was about the new features in “version 5” of everything (Elasticsearch, Logstash, Beats, Kibana and all the paid-for stuff).

Notes:

Beats packet analysis

  • Support for DB, Redis, Mongo etc…
  • Beats multiline support
  • Logstash clustering

Elasticsearch Updates

  • Resiliency (failures, security, balancing etc…)
  • Partition tolerance
    • Using UUIDs for transactions
  • Operations improvements
    • query profiler
    • reindex API
    • task manager
  • Analytics
    • Doc values
      • massive scale for aggregates, enabled by columnar storage on disk
      • not held on the Java heap
  • Time series maths
    • derivatives
    • moving averages
    • variance
  • Prediction/anomaly detection (pipeline aggregations)
  • Auto index sharding and archival
  • Can specify filters in ES directly, don’t need LogStash (pipelines)
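
The ingest pipelines bit looked genuinely useful, so here's a rough sketch going off the ES 5.x ingest docs (the index and pipeline names are placeholders I made up):

```python
# Register an ingest pipeline that groks an apache-style log line, then index
# a raw event through it; ES does the parsing, no Logstash in the middle.
import requests

ES = "http://localhost:9200"

requests.put(f"{ES}/_ingest/pipeline/apache-logs", json={
    "description": "parse apache access logs on ingest",
    "processors": [
        {"grok": {"field": "message", "patterns": ["%{COMBINEDAPACHELOG}"]}}
    ],
})

requests.post(f"{ES}/logs-2016.02.18/access?pipeline=apache-logs", json={
    "message": '127.0.0.1 - - [18/Feb/2016:09:00:00 +1100] "GET / HTTP/1.1" 200 321 "-" "curl"'
})
```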

Kibana

  • Kibana logs to ES
  • Global timezone support

Shield

  • document and field level security
  • audit logging
  • paid plugin

Alerting (Watcher)

  • alerting/monitoring
  • Slack, HipChat, JIRA, PagerDuty
  • Contextual thresholds
  • Chained inputs
  • No UI (but coming)
  • paid plugin
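
Since there's no UI yet it's all JSON against the API. Roughly what a watch looks like, going off the Watcher docs (the index, threshold and webhook below are placeholders, and the endpoint path differs between the 2.x plugin and the X-Pack version):

```python
# Sketch of a watch: search the logs every 5 minutes, and if the error count
# crosses a threshold, POST a message to a webhook (placeholder host/path).
import requests

ES = "http://localhost:9200"

watch = {
    "trigger": {"schedule": {"interval": "5m"}},
    "input": {"search": {"request": {
        "indices": ["logs-*"],
        "body": {"query": {"match": {"level": "ERROR"}}},
    }}},
    # the "contextual threshold" bit: only fire when the count is high enough
    "condition": {"compare": {"ctx.payload.hits.total": {"gt": 10}}},
    "actions": {"notify-chat": {"webhook": {
        "scheme": "https",
        "host": "hooks.example.com",
        "port": 443,
        "method": "post",
        "path": "/elastic-alerts",
        "body": "{{ctx.payload.hits.total}} errors in the last 5 minutes",
    }}},
}

requests.put(f"{ES}/_watcher/watch/error-spike", json=watch)
```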

Marvel (ES cluster monitoring)

  • Is open source now

New Graph analytics

  • paid plugin

Reporting

  • Time based triggers
  • Event based triggers
  • to send a PDF report or dashboard snapshot image to someone/something
  • paid plugin

Elastic Cloud

  • Starts at $45 per month
  • hosted elastic stack
  • Operated by ES engineers (the only offering that can claim that)
  • All X-Pack features (paid for stuff)
  • Uses AWS and is in 7 regions (and expanding)
  • Shitty admin UI (but works)
  • Auto backups (snapshots)
  • “Enterprise” cloud available (for self hosting)

The second part of the conference was basically two customers getting up on stage and giving talks. I also learned a new term, “MTTI”, which stands for “Mean Time to Investigation”. The first of the customers was Damian Harvey from Deloitte, talking about how they use the Elastic stack. Notes below:

  • he organizes the API & Microservices meetup
  • Elastic stack is part of their reference architecture
  • started as a selfish goal (easier to get logs to diagnose issues)
  • add on business value/metrics
  • they use MuleSoft (put Mule in front of legacy systems and a REST/JSON microservices layer on top of that, exposing API)
  • talked about request tracking through microservices (if a request doesn’t have a “tracking id”, any service should add one; sketch after this list)
  • Daily rolling indexes mean shard sizes aren’t an issue
  • Lots of RAM, use instances with 64GB, give JVM 32GB
  • 3 Node clusters
  • Kibana setup/config with customer/IT
  • Log pipelines are the hard part
    • Lots of legacy systems
    • Need to decouple shipping and indexing
  • Tip: use a message broker in front of ES so you don’t lose messages when it goes down (sketch after this list)
  • Talked about designing “log pipelines”
  • For message broker, can use Redis, ActiveMQ, RabbitMQ, Kafka
  • They’re moving everything to Kafka
  • New “Ingest” node in ES5 could replace LogStash
  • Don’t like LogStash, moving to Beats
  • Need to decouple for ES maintenance
  • Need Curator to roll data (they just keep a day’s worth, don’t care about shard sizes)
  • Chorus case study
    • Use MuleSoft Cloud Hub for hosting
    • Can’t hit “legacy” systems with “web scale”
    • Data on which versions of the API are in use is important (when you need to sunset people)
    • Lambdas call CloudHub APIs, pull data and inject it into ES
  • New services brought on board inherit Monitoring, Logging etc…
  • Tip: For LogStash use immutable Docker containers (all changes must go through VCS)
  • Tip: Use consistent Grok log formats (auto grok for ES, based off filenames)
  • Tip: Kibana naming conventions
  • Tip: Use message broker
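
The tracking-id idea is the one I want to steal (flagged at that bullet above). A sketch of what it could look like in a Python service; Flask and the header name are my choices, not something from the talk:

```python
# Reuse the caller's tracking id if present, otherwise mint one, so downstream
# services (and the logs shipped to Elasticsearch) can correlate requests.
import uuid
from flask import Flask, g, request

app = Flask(__name__)

@app.before_request
def ensure_tracking_id():
    g.tracking_id = request.headers.get("X-Tracking-Id") or str(uuid.uuid4())

@app.after_request
def propagate_tracking_id(response):
    response.headers["X-Tracking-Id"] = g.tracking_id
    return response

@app.route("/orders")
def orders():
    app.logger.info("listing orders tracking_id=%s", g.tracking_id)
    return {"orders": [], "tracking_id": g.tracking_id}
```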
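
And a toy version of the message-broker decoupling tip (also flagged above): shippers push JSON events onto a Redis list and a consumer drains them into ES in bulk. Redis is just the shortest of their broker options to show; there's no retry or error handling here, so it's the idea rather than something to run in anger:

```python
# Consumer that pops log events off a Redis list and bulk-indexes them into
# Elasticsearch. While ES is being restarted, shippers keep pushing to Redis
# and this process catches up later (no retry handling in this sketch).
import json

import redis
from elasticsearch import Elasticsearch, helpers

r = redis.Redis(host="localhost", port=6379)
es = Elasticsearch("http://localhost:9200")

def drain(batch_size=500):
    batch = []
    while True:
        item = r.blpop("log-events", timeout=5)
        if item is not None:
            _, raw = item
            batch.append({"_index": "logs-current", "_source": json.loads(raw)})
        # flush on a full batch, or when the queue has gone quiet
        if batch and (item is None or len(batch) >= batch_size):
            helpers.bulk(es, batch)
            batch = []

if __name__ == "__main__":
    drain()
```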

The final talk was from a very enthusiastic man of South African descent whose name I can’t remember, the BI team lead at Cigna, talking about using Elasticsearch for BI. Notes below:

  • Needed a data warehouse
  • Needed Visualization
  • Needed explorability of data (drill down)
  • Helps drive strategy and decision making
  • AS400 systems (delta loading)
  • All fed into SQL Server (source of truth)
  • ELK pulls from/is populated by SQL Server
  • All new reports used to go through Oracle DBA
    • had to write PL/SQL for each new report
  • 2.2mil vs 100k
  • Team consists of SQL Server DBAs
  • Use Python for ingestion instead of Logstash (probably on the mainframes)
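
Pure guesswork on my part as to what that Python ingestion looks like, but the shape would be something like this (connection string, table, column and index names are all made up):

```python
# Pull recently-changed rows out of SQL Server and bulk-index them into ES.
# The watermark would normally come from the last successful run, not a literal.
import pyodbc
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=sqlserver;DATABASE=claims;"
    "UID=etl;PWD=secret"
)

def rows_to_actions(cursor):
    cols = [c[0] for c in cursor.description]
    for row in cursor:
        doc = dict(zip(cols, row))
        yield {"_index": "claims", "_id": doc["claim_id"], "_source": doc}

cursor = conn.cursor()
cursor.execute("SELECT * FROM dbo.claims WHERE updated_at > ?", "2016-02-17")
helpers.bulk(es, rows_to_actions(cursor))
```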

Did 3 x 10 pushups
