Skip to content

Monotonic time#1126

Closed
pboling wants to merge 6 commits intosimplecov-ruby:mainfrom
VitalConnectInc:monotonic-time
Closed

Monotonic time#1126
pboling wants to merge 6 commits intosimplecov-ruby:mainfrom
VitalConnectInc:monotonic-time

Conversation

@pboling
Copy link
Copy Markdown

@pboling pboling commented Mar 7, 2025

Important background: Monotonic time is the "correct" way to calculate elapsed time in Ruby:
https://blog.dnsimple.com/2018/03/elapsed-time-with-ruby-the-right-way/

I do not understand the rubocop failure in CI, because when I run locally there are no violations:

❯ bundle exec rubocop -a
Inspecting 111 files
...............................................................................................................

111 files inspected, no offenses detected

Same result when run with bundle exec rake rubocop as it is in CI.

CC @ilyazub

pboling added 4 commits March 6, 2025 17:37
Previously some gems masked a bug in Rails v6 and v7 where `logger` was not required,
  plugging the hole by requiring it themselves.
Because this app is pinned to Rails v6.1, we must require logger manually.
In order to not be reliant on other libraries to fix the bug in Rails for us, we do it too.
See: https://stackoverflow.com/questions/79360526/uninitialized-constant-activesupportloggerthreadsafelevellogger-nameerror
- explicit proxy for determining if running in parallel
- `created_at` / `timestamp` is for wall clock time (for humans)
- `started_at` is for monotonic / elapsed time (for computers)
@pboling
Copy link
Copy Markdown
Author

pboling commented Mar 7, 2025

This is now ready for review @amatsuda @PragTob

@pboling
Copy link
Copy Markdown
Author

pboling commented Jun 13, 2025

Note: the rubocop failure will be fixed once @simi's work in #1131 is merged, and thus it will be a completely green build!
Ping @amatsuda @PragTob.

Copy link
Copy Markdown
Collaborator

@sferik sferik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR.

I added some minor concerns that I left comments about but I have a bigger concern with the core approach: monotonic time doesn’t work across processes.

Monotonic clock values are per-process. They represent something like the process’s view of system uptime and have no shared epoch between processes. The started_at value gets serialized to .resultset.json and then read back in time_since_result_creation by a different process (the one doing the merge). When that process computes current_monotonic - stored_started_at, the two values come from different processes with different monotonic origins, so the result is meaningless. This could cause valid results to be incorrectly expired or stale results to be kept.

The cross-process merge path is actually one place where wall clock time is the right choice, since it’s a clock with a shared epoch. If the concern is Time.now being stubbed by Timecop, switching created_at to use Process.clock_gettime(Process::CLOCK_REALTIME), solves that without needing monotonic time.

Would you consider a version of this PR scoped to just that change?

Comment thread lib/simplecov/timer.rb
class Timer
# Monotonic clock: Process::CLOCK_MONOTONIC
# Wall clock: Process::CLOCK_REALTIME
attr_accessor :start_time, :clock
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be attr_reader, since there’s no reason to mutate them after construction.

Comment thread lib/simplecov.rb
# and for scenarios where the normal ENV variables may not get set,
# use an explicit trigger defined internally: SIMPLECOV_PARALLEL
return true if ENV.fetch("SIMPLECOV_PARALLEL", nil)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This environment variable is introduced but not documented, tested, and doesn’t guard against "false" being truthy.

Comment thread Gemfile.lock

BUNDLED WITH
2.7.0.dev
2.6.5
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a local artifact that shouldn’t be committed.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be updated to not be a pre-release version of bundler!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been update to 4.0.9 in the main branch. Feel free to rebase.

@pboling
Copy link
Copy Markdown
Author

pboling commented Mar 28, 2026

@sferik

monotonic time doesn’t work across processes.

I just looked into this a bit deeper, because it surprised me to hear it. What you say is true of some programming languages, including notably Python. It is not generally true of Ruby, nor of system level OS calls in POSIX systems.

  • System-Wide by Default: On all modern POSIX-compliant systems (Linux, macOS, BSD), CLOCK_MONOTONIC is system-wide. Ruby simply wraps the OS system call, meaning Process A and Process B will receive the same time value at the same moment.
  • Shared Epoch: The "origin" (time zero) is system-wide, and is typically the system boot time. Because both processes share this same origin, comparing values between them is valid.
  • In the Ruby documentation this is hedged, for a valid, albeit perhaps edge case, reason, because on non-POSIX compliant systems the origin may vary from the POSIX standard, like old versions of Windows.

Exceptions:

  • If simplecov timings are to be compared across different servers, then the monotonic clock values would indeed lose all significance.

I don't know if this changes your view. If it doesn't, then I'm happy to switch it back to wall clock - though I expect it could reintroduce the race condition I built this to work around.

Perhaps another option would be to allow the user to configure the timer either way. Then cross-server setups have meaningful timings. Worth noting that if a suite is run across servers it almost certainly implies long running suites. On the other hand, if a suite runs on a single server in different processes there is a decent chance of them executing extremely fast, increasing the liklihood of running into the classic issues with wall clock time math.

What do you think of supporting both?

@pboling
Copy link
Copy Markdown
Author

pboling commented Mar 28, 2026

Found this interesting documentation:

Note: Time calculations on all platforms and languages are sensitive to changes to the system clock. To alleviate the potential problems associated with changing the system clock while an application is running, most modern operating systems provide a monotonic clock that operates independently of the system clock. A monotonic clock cannot be used to determine human-friendly clock times. A monotonic clock is used exclusively for calculating time intervals. Not all Ruby platforms provide access to an operating system monotonic clock. On these platforms a pure-Ruby monotonic clock will be used as a fallback. An operating system monotonic clock is both faster and more reliable than the pure-Ruby implementation. The pure-Ruby implementation should be fast and reliable enough for most non-realtime operations. At this time the common Ruby platforms that provide access to an operating system monotonic clock are MRI 2.1 and above and JRuby (all versions).

https://ruby-concurrency.github.io/concurrent-ruby/master/Concurrent/ScheduledTask.html#:~:text=Note:,loosely%20based%20on%20Java's%20ScheduledExecutorService.

@sferik
Copy link
Copy Markdown
Collaborator

sferik commented Mar 28, 2026

I stand corrected on monotonic time. I’m not sure why I had a mistaken understanding of how it worked. Thank you for bringing me up-to-speed.

Despite being right about that, I still feel like this pull request is over-engineered for the actual problem. If the main goal is to avoid Timecop interference, Process.clock_gettime(Process::CLOCK_REALTIME) is a one-line fix that doesn’t require a new class, a new serialized field, fallback logic, etc. I believe just that change would provide 90% of the benefit with none of the complexity. What if just started with that?

Note, I don’t believe this change would resolve #559 or #815. I believe those issues are caused by a process synchronization race condition in the parallel_tests integration. Here is what I think is happening:

  1. Multiple test processes finish at different times.
  2. Each process reads .resultset.json, merges what’s there with its own results, and writes the merged report.
  3. The process that checks coverage thresholds (via at_exit) may read the resultset before other processes finish writing their results to it.

@pboling
Copy link
Copy Markdown
Author

pboling commented Mar 28, 2026

I'll submit a new PR with the simpler approach!

@pboling pboling closed this Mar 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants