Commit 15e5f36

update readme
1 parent 1d0ed98 commit 15e5f36

2 files changed: +228 -196 lines changed

README.md

+23 -196
@@ -1,205 +1,32 @@
1-
# Instrumenting Elixir with Telemetry + LiveView
2-
## Resources
3-
https://blog.smartlogic.io/instrumenting-with-telemetry/
4-
https://github.com/beam-telemetry/telemetry
5-
https://hexdocs.pm/plug/Plug.Telemetry.html
1+
# Quantum
6 2

7-
## Outline
8-
* Telemetry plug instruments request duration
9-
* Attach request `[:my, :app, :start/:stop]` to a handler
10-
* That handler sends messages to async reporters, in this case LiveView, but it could also be StatsD, Prometheus, a log statement, etc. This is where we talk about async processes.
11-
* LiveView handles the event and updates something in the UI
12-
* Add another point of instrumentation, maybe a metric for successful/failed external API requests
13-
* Attach a handler for that event, report to LV, etc.
14-
* Instrumenting Ecto query times?
3+
Quantum is a dummy Phoenix app used to illustrate instrumentation with Telemetry.
15 4

16-
## UI with Chartkick
17-
https://github.com/buren/chartkick-ex
18-
https://jacobburenstam.com/chartkick-ex/
19-
https://github.com/buren/chartkick-phoenix-example
5+
## Up and Running
20 6

21-
## Implementation Plan
7+
* Clone down this repo
8+
* `cd` into the repo and run `mix deps.get`
9+
* Then, run `npm install --prefix ./assets`
10+
* Start the Phoenix server: `mix phx.server`
22 11

23-
* Simple Phoenix app with endpoints:
24-
- [X] Landing page
25-
- [ ] Simple auth flow (sign up/sign in)
26-
- [X] User schema and module, account context
27-
- [X] Log in, sign up, log out, user show
28-
- [ ] Set up Telemetry + handler
29-
- [ ] Set up LiveView to receive messages from Telemetry handler, display simple count for num logins
30-
* Metric increment for num logins -> chart, count
31-
* Metric increment success/failure for logins -> chart
32-
* Query duration for find and create queries -> chart
33-
* Telemetry plug for landing page load time (add a random sleep between 1 and 5 seconds) -> chart
12+
To run with StatsD so that you can see your metrics processed, follow the StatsD installation instructions [here](https://anomaly.io/statsd-install-and-config/index.html).
34 13

14+
## Learn More
35 15

36-
# Instrumenting LiveView w Telemetry
37-
We know how to take advantage of the telemetry plug to measure request times. But what about client/server interaction that does not occur over HTTP? As we use LV for more and more real-time features, how can we instrument WS communication duration in a sane and scalable manner?
16+
Check out the blog series, Instrumenting Phoenix with Telemetry, here:
38 17

39-
* We have a telemetry plug to automatically get request duration but no such thing for instrumenting LV traffic
40-
* Build something that gets LV message -> render duration
41-
* Using telemetry to report to 3rd party or another LV with charting lib
42-
* Use an approach like this to define a macro that will execute any telemetry calls and then execute defined function body https://carlo-colombo.github.io/2017/06/10/Track-functions-call-in-Elixir-applications-with-Google-Analytics/
18+
* Part I: Telemetry Under The Hood
19+
* Part II: Handling Telemetry Events with `TelemetryMetrics` + `TelemetryMetricsStatsd`
20+
* Part III: Observing Phoenix + Ecto Telemetry Events
21+
* Part IV: Erlang VM Measurements with `telemetry_poller`, `TelemetryMetrics` + `TelemetryMetricsStatsd`
43 22

44-
Inspired by: https://github.com/elixir-plug/plug/blob/master/lib/plug/telemetry.ex#L76
23+
### Follow Along With The Code
45 24

46-
```ruby
47-
defmeasured handle_event(event, payload, socket) do
48-
# should do the equivalent of:
49-
start_time = System.monotonic_time()
50-
prefix = "#{String.downcase(inspect(__MODULE__))}.#{event}"
51-
opts = [] # any tags?
52-
:telemetry
53-
.execute("#{prefix}.start", %{time: start_time},%{socket: socket, options: opts})
54-
socket = assign(socket, %{telemetry_event_prefix: prefix})
55-
# execute body with new socket
56-
end
57-
58-
defmeasured render(assigns) do
59-
# should do the equivalent of:
60-
duration = System.monotonic_time() - start_time
61-
prefix = assigns.telemetry_event_prefix
62-
opts = [] # any tags?
63-
:telemetry
64-
.execute("#{prefix}.stop", %{duration: duration}, %{conn: conn, options: opts})
65-
socket = assign(socket, :telemetry_event_prefix, nil)
66-
# execute body with assigns
67-
end
68-
69-
def execute_before_render_callbacks(assigns) do
70-
assigns.before_render()
71-
# |> invoke function body with updated assigns
72-
end
73-
```
74-
75-
* Macro needs to switch on function type -> `handle_event` or `render`. Should just call to function if any other type.
76-
* Don't need to register per-event b/c LV process will only work on one event at a time. So we are sure that render is for the event that we just registered a process for. But we should track event name, only for reporting, otherwise how do we know which event it is that we just checked and reported duration for.
77-
* No need to clear prefix from socket assigns before rendering b/c it will update as soon as next event is received? What about `handle_info` tho? Assume we will measure that too. Either you're using telemetry to measure duration of _all_ incoming messages or you're not. No way to enforce this tho :( Better to clear prefix though and not assume that every message is instrumented.
78-
* Have to manually attach Telemetry event handler for each telemetry event. Either we attach once for the LV module start/stop and use tags to be more granular about event type or user is on the hook for attaching handler for each event's start/stop. I'm leaning towards option 1 but have to play around with tags more first.
79-
80-
81-
## Implementation Plan
82-
83-
* Simple live view with three events--three buttons that you click to change color and each one has a sleep for a diff amount of time.
84-
* Instrument duration of "request/response" for each event type, register LV telemetry handler to receive telemetry events, that handler can send to our dashboard LV. Question: Same telemetry event handler for all events or separate for LV vs. application? I.e. separation of concerns with telemetry event handling modules or just one giant one?
85-
* prob. want to play around with metric label names and tags
86-
87-
## Instrumenting Phoenix with Telemetry + StatsD
88-
89-
### Resources
90-
91-
* https://hexdocs.pm/telemetry_metrics/Telemetry.Metrics.html
92-
* https://hexdocs.pm/telemetry_metrics_statsd/TelemetryMetricsStatsd.html
93-
* https://github.com/beam-telemetry/telemetry_metrics_statsd/issues/15
94-
95-
### Next Steps
96-
97-
* Can the reporter support Dogstatsd events? Can we hack it?
98-
- [X] Which telemetry events are Phoenix/Ecto/etc. emitting for us for free?
99-
100-
- [X] Run statsd to view output for each of the mapped metrics
101-
* [Installing statsd for mac](https://anomaly.io/statsd-install-and-config/index.html)
102-
103-
* Verify how to report the "free" metrics you can hook into:
104-
(remember summary == timing/duration)
105-
* [List all](https://til.hashrocket.com/posts/o17nfvwzbo--list-all-telemetry-event-handlers)
106-
* HTTP request duration by route
107-
* [Source](https://github.com/phoenixframework/phoenix/blob/d4596650df21e7e0603debcb5f2ad25eb9ac082d/lib/phoenix/router.ex)
108-
* [HTTP request count by route](https://app.datadoghq.com/dashboard/dmk-qzr-9sf/sophies-timeboard-22-mar-2020-1214?from_ts=1584882280954&fullscreen_section=overview&fullscreen_widget=4612552948460480&live=true&tile_size=m&to_ts=1584896680954&fullscreen_start_ts=1584882486335&fullscreen_end_ts=1584896886335&fullscreen_paused=false)
109-
* [Ecto query duration by query](https://app.datadoghq.com/metric/explorer?live=true&page=0&is_auto=false&from_ts=1584889953332&to_ts=1584893553332&tile_size=m&exp_metric=quantum.repo.query.total_time.count&exp_scope=command%3Aselect%2Csource%3Ausers&exp_agg=avg&exp_row_type=metric&fullscreen=1011)
110-
* [Ecto query count by query](https://app.datadoghq.com/metric/summary?filter=quantum.repo.query&metric=quantum.repo.query.count)
111-
* VM metrics (last_count == gauge) (need polling)
112-
* Live View?
113-
* [Channel joined](https://github.com/phoenixframework/phoenix/blob/8a4aa4eed0de69f94ab09eca157c87d9bd204168/lib/phoenix/channel/server.ex#L319)
114-
* [Channel handle_in](https://github.com/phoenixframework/phoenix/blob/8a4aa4eed0de69f94ab09eca157c87d9bd204168/lib/phoenix/channel/server.ex#L319)
115-
* Reporting custom metrics to StatsD
116-
* Emit telemetry event and define corresponding metric in Telemetry module
117-
* Extending Telemetry to support Datadog events
118-
* Hook into Telemetry event with custom handler
119-
* Configuring global tags + prefixes: https://hexdocs.pm/telemetry_metrics_statsd/TelemetryMetricsStatsd.html#module-global-tags
120-
* Benefits:
121-
* Abstract away common instrumentation needs and automatically send such events to your reporter of choice.
122-
* Can still define custom handlers for events and do more stuff with them
123-
124-
## TODO
125-
126-
- [X] Success/failure web request response instrumentation
127-
* LiveView metrics with channel joined and channel handled_in -> can't be done OOTB, blog post should explain, show channel source, link to LV issue
128-
* Three custom metrics:
129-
* Worker polling
130-
* Custom event polling
131-
* Telemetry plug
132-
- [X] LiveView handle event duration and timer
133-
- [X] VM metrics with polling
134-
* Visualize DD reporting by using DD formatter but running regular statsd, grab log statement from error message
135-
136-
## Notes
137-
* We're instrumenting for free:
138-
* Database query duration and counts
139-
* HTTP request duration and counts
140-
* VM metrics
141-
* Telemetry event handling for free with Telemetry metrics module--can emit any event with `:telemetry.execute` (is this Erlang??) and don't need to define and attach custom handle module.
142-
143-
## Blog Post
144-
* What is observability? What is instrumentation?
145-
* Common needs: web requests, database queries
146-
* Show the DIY - define an event + module, attach, custom log in handler module to report, log, etc. This might be a good place to look under the hood at ETS.
147-
* Reporter calls `telemetry.attach`
148-
* Look in `telemetry.erl`:
149-
* attach stores handler modules with associated events in ETS
150-
* execute looks up the handler for the event in ETS and invokes it
151-
* This is all abstracted away with Telemetry metrics!
152-
* OOTB instrumentation with Elixir Telemetry
153-
* We'll get web requests, database queries, VM monitoring
154-
* Implementation
155-
* Use Telemetry package
156-
* Establish module that defines which events you are listening to--this attaches them to the default handler.
157-
* Go through all of the OOTB events and link to source code
158-
* Look at source code in Phoenix that emits those telemetry events.
159-
* Tagging - slice up HTTP requests by controller + action; DB queries by source and command. Tags become part of metric name in standard statsd formatting. Custom tag values functions
160-
* Note on Datadog formatter
161-
* Tags translate into metric tags (show the mapping)
162-
* Can leverage prefix, global tags, HTTP route tag now more usefully
163-
* Custom instrumentation -> not necessary, any event can be handled by one Telemetry module importing `Telemetry.Metrics`
164-
* Instrumenting LiveView with Phoenix's OOTB Telemetry events - CAN'T! Worth noting and comparing to Phoenix channel OOTB telemetry events, link to issue.
165-
* Custom duration and count instrumentation for
166-
* Telemetry under the hood - trace the flow of Phoenix/Ecto/app code emitting event and telemetry looking up event handler and calling it. Look at tags, etc.
167-
168-
### Questions
169-
- [X] How to instrument success/failure of web requests?
170-
* Use render errors event? https://github.com/phoenixframework/phoenix/blob/00a022fbbf25a9d0845329161b1bc1a192c2d407/lib/phoenix/endpoint/render_errors.ex
171-
- Refactoring Telemetry module--where does it live, can we break out into sub-modules, do we need a context, etc.
172-
173-
### Ecto Telemetry Event Source Code
174-
* https://github.com/elixir-ecto/ecto/blob/2aca7b28eef486188be66592055c7336a80befe9/lib/ecto/repo.ex#L95
175-
176-
## To Do
177-
* Post 1: Intro to Telemetry in Elixir (covers: intro to obs, getting started with hand-rolled approach, Telemetry under the hood)
178-
* What is observability/why do we need it? What's so great about getting it with Telemetry lib?
179-
* DIY metrics with Telemetry lib -> start with dummy Quantum app and emit event for every sign up (counter and duration)
180-
* Define handler with callback. That callback does some reporting to StatsD, but can dummy this up.
181-
* Attach handler to event
182-
* Execute event
183-
* Under the hood
184-
* Telemetry attach adds to ETS
185-
* Telemetry execute looks up handler in ETS and invokes it
186-
* We need abstraction! Right now, we hand-rolled:
187-
* Handler module definition and callback
188-
* Reporting code
189-
* Calls to attach
190-
* Even our call to execute seems kind of onerous--plenty of stuff that _everyone_ would want to instrument (HTTP request counts and durations, look at success/failure responses, Ecto query times)
191-
* Elixir abstracts a lot of this away!
192-
* Lots of OOTB events emitted--baked in telemetry events executed from Phoenix and Ecto source code and provides a family
193-
* No need to define custom handlers, reporting logic and enact attach calls thanks to Elixir's family of Telemetry libs--metrics, polling, reporters.
194-
* Post 2: OOTB Instrumentation with Telemetry Metrics, Polling and Reporters (covers OOTB instrumentation, usage of reporters, adding "custom" events with little effort or custom code)
195-
* Up and running:
196-
* Define module that uses telemetry metrics
197-
* Declare which OOTB events you will listen to in your `metrics` function
198-
* Start supervisor with Statsd reporter, VM polling in application.ex
199-
* Closer look at events
200-
* Each event source code, map execute to metric func, view in statsd and dogstatsd
201-
* Under the hood to see that reporter calls attach, stores its own module name with event
202-
* Telemetry calls execute, which looks up handler and invokes `handle_event`
203-
* Reporter's `handle_event` contains all the statsd/udp logic, uses `metrics` struct definitions to format metrics for statsd and sends traffic
204-
205-
Where to put custom event section? How to sequence "closer look at events" vs. "metrics + reporter under the hood"? Better to see it wired all up and then closer look at events maybe? Maybe keep the hand-rolled sign-in event but get rid of the custom module and attachment call, instead move that into new telemetry module. Then show it all wired up, including looks under the hood. Then replace with OOTB metrics, link to source code, list all helpful metrics. Maybe leave out LV entirely.
25+
* [Part I starting state branch](https://github.com/SophieDeBenedetto/quantum/tree/part-1-start)
26+
* [Part I solution branch](https://github.com/SophieDeBenedetto/quantum/tree/part-1-solution)
27+
* [Part II starting state branch](https://github.com/SophieDeBenedetto/quantum/tree/part-2-start)
28+
* [Part II solution branch](https://github.com/SophieDeBenedetto/quantum/tree/part-2-solution)
29+
* [Part III starting state branch](https://github.com/SophieDeBenedetto/quantum/tree/part-3-start)
30+
* [Part III solution branch](https://github.com/SophieDeBenedetto/quantum/tree/part-3-solution)
31+
* [Part IV starting state branch](https://github.com/SophieDeBenedetto/quantum/tree/part-4-start)
32+
* [Part IV solution branch](https://github.com/SophieDeBenedetto/quantum/tree/part-4-solution)
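
The notes removed in this commit describe a hand-rolled flow: define a handler, attach it to an event, then execute the event so Telemetry looks the handler up and invokes it. A minimal sketch of that flow might look like the script below. It is not taken from the Quantum repo: the `SignUpHandler` module, the `[:quantum, :sign_up, :stop]` event name, and the `reply_to` config key are hypothetical names chosen for illustration, and the script assumes Elixir 1.12+ so that `Mix.install/1` can fetch the `:telemetry` package.

```elixir
# Assumption: Elixir 1.12+ with network access, so Mix.install/1 can fetch :telemetry.
Mix.install([{:telemetry, "~> 1.0"}])

defmodule SignUpHandler do
  # Hypothetical handler module. Telemetry calls this with the event name,
  # the measurements map, the metadata map, and the config given to attach/4.
  def handle_event([:quantum, :sign_up, :stop], measurements, metadata, config) do
    # A real handler might report to StatsD or log; here we just message a process.
    send(config.reply_to, {:sign_up, measurements.duration, metadata.status})
  end
end

# attach/4 registers the handler for the event (the notes above trace this to ETS).
:ok =
  :telemetry.attach(
    "quantum-sign-up-handler",
    [:quantum, :sign_up, :stop],
    &SignUpHandler.handle_event/4,
    %{reply_to: self()}
  )

# execute/3 looks up the handlers attached to the event and invokes each of them.
start_time = System.monotonic_time()

:telemetry.execute(
  [:quantum, :sign_up, :stop],
  %{duration: System.monotonic_time() - start_time},
  %{status: :ok}
)

receive do
  {:sign_up, duration, status} ->
    IO.puts("sign up #{status} in #{duration} native time units")
after
  1_000 -> IO.puts("no telemetry event received")
end
```

Passing the handler as an external capture (`&SignUpHandler.handle_event/4`) rather than an anonymous function follows the recommendation in the `:telemetry` docs. The handler, the reporting code, and the attach call are exactly the pieces the notes propose replacing with `Telemetry.Metrics` and a reporter.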
