Skip to content

Comments

Fix missing private clauses in 3D viscous GPU loops#1225

Open
sbryngelson wants to merge 7 commits intoMFlowCode:masterfrom
sbryngelson:fix/viscous-3d-gpu-private
Open

Fix missing private clauses in 3D viscous GPU loops#1225
sbryngelson wants to merge 7 commits intoMFlowCode:masterfrom
sbryngelson:fix/viscous-3d-gpu-private

Conversation

@sbryngelson
Copy link
Member

@sbryngelson sbryngelson commented Feb 21, 2026

User description

Summary

  • The 3D shear stress and bulk stress GPU_PARALLEL_LOOP directives were missing rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their private clauses
  • The corresponding 2D loops already included these variables
  • Without privatization, these variables can cause race conditions when running on GPU

Test plan

  • Verify 3D viscous test cases produce correct results on GPU
  • No golden file changes expected (fixes GPU race condition, CPU results unchanged)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance
    • Improved GPU parallelization for viscous stress calculations, reducing race-condition risk and improving throughput and stability in multi-dimensional simulations. No changes to public interfaces or simulation behavior; results remain consistent while GPU-enabled runs are more reliable and often faster.

CodeAnt-AI Description

Fix GPU data races in 3D viscous loops by privatizing loop indices and temporaries

What Changed

  • The 3D shear- and bulk-stress GPU parallel regions now privately scope the loop indices (i, j, k, l, q) and per-thread viscous temporaries (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum and related temporaries), preventing those variables from being shared across GPU threads
  • Privatization was added consistently across all affected 3D viscous loops so GPU runs do not suffer from cross-thread interference
  • CPU execution and numerical results remain unchanged

Impact

✅ Correct 3D viscous results on GPU
✅ Fewer GPU race-condition artifacts in viscous simulations
✅ No change to CPU outputs

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

@codeant-ai

This comment has been minimized.

@coderabbitai

This comment has been minimized.

@codeant-ai codeant-ai bot added the size:XS This PR changes 0-9 lines, ignoring generated files label Feb 21, 2026
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

@codeant-ai

This comment has been minimized.

@codeant-ai

This comment has been minimized.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a GPU data-race hazard in the 3D viscous stress kernels by aligning their GPU_PARALLEL_LOOP privatization with the already-correct 2D implementations.

Changes:

  • Add missing mixture variables (rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) to the private clauses for the 3D shear-stress and bulk-stress GPU parallel loops.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

321-321: LGTM — consider adding q to the private clause to be fully explicit

The added variables (i, j, k, l, rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum) correctly mirror the 2D shear-stress loop at line 104 and fix the GPU race condition.

The integer q (declared at line 83) is used as the induction variable of the inner do q = 1, Re_size(i) seq sub-loop (lines 392–396). While OpenACC treats subroutine-local scalars as implicitly private in a parallel region, the project guideline calls for all loop-local variables to be listed explicitly. The same gap exists in the 2D counterpart at line 104.

💡 Suggested addition
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

(The same addition should be applied to the 2D shear/bulk loops at lines 104 and 214, and to the 3D bulk loop at line 430.)

Based on learnings: "Ensure private(...) declarations on all loop-local variables in GPU-accelerated code to prevent unintended data sharing."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulation/m_viscous.fpp` at line 321, Add the induction variable q to
the private clause of the GPU_PARALLEL_LOOP directives to explicitly mark it as
loop-local: update the existing GPU_PARALLEL_LOOP(collapse=3, private='[...]')
that governs the 3D shear-stress loop (and the analogous GPU_PARALLEL_LOOP
directives used for the 2D shear and bulk loops and the 3D bulk loop) so that
'q' is included among the private variables alongside
i,j,k,l,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum,alpha_visc,alpha_rho_visc,Re_visc,tau_Re;
this ensures the inner "do q = 1, Re_size(i)" seq sub-loop uses a private
induction variable and satisfies the project guideline for explicit loop-local
private declarations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/simulation/m_viscous.fpp`:
- Line 430: The OpenACC/OpenMP GPU_PARALLEL_LOOP directive missing the
inner-loop induction variable causes incorrect privatization; update the
directive that currently lists private='[i,j,k,l,rho_visc, gamma_visc,
pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]' to
also include the inner seq-loop induction variable q so it matches the 2D
bulk-stress loop’s private clause and the inner seq loop that uses q (the inner
loop around lines 501–505 referenced in the review).

---

Nitpick comments:
In `@src/simulation/m_viscous.fpp`:
- Line 321: Add the induction variable q to the private clause of the
GPU_PARALLEL_LOOP directives to explicitly mark it as loop-local: update the
existing GPU_PARALLEL_LOOP(collapse=3, private='[...]') that governs the 3D
shear-stress loop (and the analogous GPU_PARALLEL_LOOP directives used for the
2D shear and bulk loops and the 3D bulk loop) so that 'q' is included among the
private variables alongside
i,j,k,l,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum,alpha_visc,alpha_rho_visc,Re_visc,tau_Re;
this ensures the inner "do q = 1, Re_size(i)" seq sub-loop uses a private
induction variable and satisfies the project guideline for explicit loop-local
private declarations.

@codecov
Copy link

codecov bot commented Feb 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.05%. Comparing base (df28255) to head (9b5101e).
⚠️ Report is 20 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1225   +/-   ##
=======================================
  Coverage   44.05%   44.05%           
=======================================
  Files          70       70           
  Lines       20496    20496           
  Branches     1989     1989           
=======================================
  Hits         9030     9030           
  Misses      10328    10328           
  Partials     1138     1138           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 22, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 22, 2026
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 22, 2026

CodeAnt AI Incremental review completed.

@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 23, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 23, 2026
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 23, 2026

CodeAnt AI Incremental review completed.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/simulation/m_viscous.fpp (1)

105-105: Minor: trailing space before comma in alpha_visc_sum ,alpha_visc.

Lines 105 and 215 have alpha_visc_sum ,alpha_visc (space before the comma), which is inconsistent with the standard alpha_visc_sum, alpha_visc formatting used in the corresponding 3D blocks at lines 322 and 431.

✏️ Proposed fix (both occurrences)
-$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum ,alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')
+$:GPU_PARALLEL_LOOP(collapse=3, private='[i,j,k,l,q,rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum, alpha_visc, alpha_rho_visc, Re_visc, tau_Re]')

Also applies to: 215-215

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulation/m_viscous.fpp` at line 105, Fix the minor formatting typo in
the GPU_PARALLEL_LOOP private list: remove the stray space before the comma so
the variables read "alpha_visc_sum, alpha_visc" (consistent with the 3D blocks);
update both occurrences where "alpha_visc_sum ,alpha_visc" appears (lines around
the GPU_PARALLEL_LOOP uses) so the private clause lists "alpha_visc_sum,
alpha_visc" exactly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/simulation/m_viscous.fpp`:
- Line 105: Fix the minor formatting typo in the GPU_PARALLEL_LOOP private list:
remove the stray space before the comma so the variables read "alpha_visc_sum,
alpha_visc" (consistent with the 3D blocks); update both occurrences where
"alpha_visc_sum ,alpha_visc" appears (lines around the GPU_PARALLEL_LOOP uses)
so the private clause lists "alpha_visc_sum, alpha_visc" exactly.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e892dc and 164cda3.

📒 Files selected for processing (1)
  • src/simulation/m_viscous.fpp

@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 164cda3 to 9b3ee97 Compare February 23, 2026 14:48
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 23, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 23, 2026
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 23, 2026

CodeAnt AI Incremental review completed.

@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 23, 2026
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from c979c72 to 6ba8b18 Compare February 24, 2026 02:35
@github-actions
Copy link

Claude Code Review

Head SHA: 6ba8b18

Files changed: 1

  • src/simulation/m_viscous.fpp (+4 / -4)

Summary

  • Fixes missing private clause variables in four GPU_PARALLEL_LOOP directives in the viscous stress computation routines
  • The 2D shear and bulk stress loops (guarded by num_dims > 1) were each missing q from their private lists, where q is the inner loop iterator over Reynolds number components
  • The 3D shear and bulk stress loops (guarded by num_dims > 2) were severely under-specified, missing i, j, k, l, q, rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from private lists
  • Without these fixes, missing private variables cause GPU race conditions where multiple threads share the same scalar intermediate storage — producing incorrect results silently on GPU
  • After the fix, all four loops share an identical, complete private list: [i,j,k,l,q,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum,alpha_visc,alpha_rho_visc,Re_visc,tau_Re]

Findings

No correctness issues or CLAUDE.md violations found. The changes are correct and complete.

Improvement opportunities:

  1. Whitespace inconsistency in 2D private lists (src/simulation/m_viscous.fpp, lines ~103 and ~213): The 2D loop private lists retain a pre-existing space-before-comma: alpha_visc_sum ,alpha_visc, while the newly written 3D lists correctly use alpha_visc_sum, alpha_visc. This PR was a good opportunity to normalize the formatting, since ./mfc.sh format may not touch Fypp macro string arguments.

  2. Slightly inaccurate PR description (src/simulation/m_viscous.fpp, lines ~103 and ~213): The PR description states "The 2D loops already included these variables," but the diff shows that the 2D loops also required a fix — adding q to their private lists. The commit message ("Add missing q to GPU_PARALLEL_LOOP private lists") is accurate; the body description is not.

  3. No GPU-specific regression test coverage (toolchain/mfc/test/cases.py): The race conditions fixed here are invisible in CPU runs (no output change expected), which means the test suite's golden-file approach cannot catch regressions of this kind. A 3D viscous GPU smoke test that verifies numerical results on GPU vs. CPU would provide a safety net for future private clause omissions in m_viscous.fpp.

@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 6ba8b18 to dd51d68 Compare February 24, 2026 16:03
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 24, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 24, 2026
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 24, 2026

CodeAnt AI Incremental review completed.

sbryngelson and others added 3 commits February 24, 2026 11:49
The 3D shear stress and bulk stress GPU parallel loops were missing
rho_visc, gamma_visc, pi_inf_visc, and alpha_visc_sum from their
private clauses. The corresponding 2D loops already had these
variables listed. Without privatization, these variables could cause
race conditions on GPU.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add i,j,k,l to the private list for the 3D shear_stress and
bulk_stress GPU parallel loops, matching the pattern already used
by the analogous 2D loops at lines 105 and 215.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The sequential loop iterator q (used in `do q = 1, Re_size(i)`) was not
privatized in any of the four GPU parallel regions. Without explicit
privatization, q is shared across GPU threads on OpenACC and AMD OpenMP
backends, causing a data race in Reynolds number computation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sbryngelson sbryngelson force-pushed the fix/viscous-3d-gpu-private branch from 4893bc3 to 69b0e93 Compare February 24, 2026 16:49
danieljvickers
danieljvickers previously approved these changes Feb 24, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 24, 2026

CodeAnt AI is running Incremental review


Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

@codeant-ai codeant-ai bot added size:XS This PR changes 0-9 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 24, 2026
@github-actions
Copy link

Claude Code Review

Head SHA: 7fcb80a

Files changed: 1

  • src/simulation/m_viscous.fpp

Summary:

  • Fixes missing private clause variables in 3D GPU parallel loops for viscous stress calculations
  • The 2D loops (guarded by num_dims > 1) already had the full private clause; the 3D loops (guarded by num_dims > 2) were missing i,j,k,l,q,rho_visc,gamma_visc,pi_inf_visc,alpha_visc_sum
  • Without these in the private clause, multiple GPU threads would share these variables, causing race conditions and nondeterministic results in 3D viscous simulations on GPU
  • The fix is symmetric: all four affected loops (shear + bulk, 2D and 3D) now have identical private variable lists
  • Also fixes a trailing-space inconsistency (alpha_visc_sum ,alpha_viscalpha_visc_sum, alpha_visc) in the 2D loops

Findings:

Correctness — LGTM. The diff is clearly correct. All loop-local scalars used inside the do l/k/j nests must be private on GPU to avoid data races. The 2D loops already had the full list; the 3D loops were silently broken on GPU.

Completeness check: Looking at the loop bodies for the 3D shear/bulk stress regions (lines ~319–430), variable q is used as a loop index (iterating over fluid components or Re entries) and rho_visc, gamma_visc, pi_inf_visc, alpha_visc_sum are temporaries accumulated per cell — all correctly added to the private clause.

No issues found. This is a straightforward, well-scoped bug fix. The change is minimal, targeted, and consistent with the existing 2D pattern.

Improvement opportunities (optional):

  1. The test plan item "Verify 3D viscous test cases produce correct results on GPU" is a manual step — a regression test covering 3D viscous GPU execution would catch this class of bug automatically in future.
  2. Consider a code comment near these GPU_PARALLEL_LOOP directives noting that the private list must be kept in sync between 2D and 3D loop versions, to help future reviewers catch this kind of divergence.

@codeant-ai
Copy link
Contributor

codeant-ai bot commented Feb 24, 2026

CodeAnt AI Incremental review completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XS This PR changes 0-9 lines, ignoring generated files

Development

Successfully merging this pull request may close these issues.

2 participants