[NVVM] Support - Followup enhancements by abhilash1910 · Pull Request #1218 · NVIDIA/cuda-python

abhilash1910 · 2025-11-05T02:17:01Z

Description

Issue Link - #981

Changes to be addressed in this WIP PR:

LTO IR testing
Is there a way to add multiple modules?
(If and when it is possible to add multiple modules, a test with code that uses something from libdevice is probably a good idea.
It is also useful to be able to add a module lazily.)
apply bitcode pattern input for libnvvm

cc @leofang

copy-pr-bot · 2025-11-05T02:17:05Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

leofang · 2025-11-17T04:12:03Z

Thanks, @abhilash1910! Any ETA to wrap this up?

abhilash1910 · 2025-11-25T14:50:56Z

pre-commit.ci autofix

abhilash1910 · 2025-11-25T18:15:22Z

pre-commit.ci autofix

leofang

Thanks, Abhilash! Leaving some early feedback.

cuda_core/cuda/core/experimental/_program.py

leofang · 2025-11-25T18:42:09Z

cuda_core/tests/test_program.py

+    bitcode_path = os.environ.get("BITCODE_NVVM_PATH")
+    if not bitcode_path:
+        pytest.skip("BITCODE_NVVM_PATH environment variable is not set.Disabling the test.")
+    bitcode_file = Path(bitcode_path)
+    if not bitcode_file.exists():
+        pytest.skip(f"Bitcode file not found: {bitcode_path}")
+
+    if bitcode_file.suffix != ".bc":
+        pytest.skip(f"Expected .bc file, got: {bitcode_file.suffix}")


Would it be possible for us to avoid having a file locally? We have bitcode in this repo already:

cuda-python/cuda_bindings/tests/test_nvvm.py

Lines 12 to 141 in b9c76b3

MINIMAL_NVVMIR_TXT_TEMPLATE = b"""\
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-i128:128:128-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"

target triple = "nvptx64-nvidia-cuda"

define void @kernel() {
entry:
ret void
}

!nvvm.annotations = !{!0}
!0 = !{void ()* @kernel, !"kernel", i32 1}

!nvvmir.version = !{!1}
!1 = !{i32 %d, i32 0, i32 %d, i32 0}
""" # noqa: E501

MINIMAL_NVVMIR_BITCODE_STATIC = {

(1, 3): # (major, debug_major)

"4243c0de3514000005000000620c30244a59be669dfbb4bf0b51804c01000000210c00007f010000"

"0b02210002000000160000000781239141c80449061032399201840c250508191e048b62800c4502"

"42920b42641032143808184b0a3232884870c421234412878c1041920264c808b1142043468820c9"

"01323284182a282a90317cb05c9120c3c8000000892000000b0000003222c80820624600212b2498"

"0c212524980c19270c85a4906032645c20246382a01801300128030173046000132677b00778a007"

"7cb0033a680377b0877420877408873618877a208770d8e012e5d006f0a0077640077a600774a007"

"7640076d900e71a00778a00778d006e980077a80077a80076d900e7160077a100776a0077160076d"

"900e7320077a300772a0077320076d900e7640077a600774a0077640076d900e71200778a0077120"

"0778a00771200778d006e6300772a0077320077a300772d006e6600774a0077640077a600774d006"

"f6100776a0077160077a100776d006f6300772a0077320077a300772d006f6600774a0077640077a"

"600774d006f610077280077a10077280077a10077280076de00e7160077a300772a0077640071a21"

"4c0e11de9c2e4fbbcfbe211560040000000000000000000000000620b141a0e86000004016080000"

"06000000321e980c19114c908c092647c6044362098c009401000000b1180000ac0000003308801c"

"c4e11c6614013d88433884c38c4280077978077398710ce6000fed100ef4800e330c421ec2c11dce"

"a11c6630053d88433884831bcc033dc8433d8c033dcc788c7470077b08077948877070077a700376"

"788770208719cc110eec900ee1300f6e300fe3f00ef0500e3310c41dde211cd8211dc2611e663089"

"3bbc833bd04339b4033cbc833c84033bccf0147660077b6807376887726807378087709087706007"

"76280776f8057678877780875f08877118877298877998812ceef00eeee00ef5c00eec300362c8a1"

"1ce4a11ccca11ce4a11cdc611cca211cc4811dca6106d6904339c84339984339c84339b8c3389443"

"3888033b94c32fbc833cfc823bd4033bb0c30cc7698770588772708374680778608774188774a087"

"19ce530fee000ff2500ee4900ee3400fe1200eec500e3320281ddcc11ec2411ed2211cdc811edce0"

"1ce4e11dea011e66185138b0433a9c833bcc50247660077b68073760877778077898514cf4900ff0"

"500e331e6a1eca611ce8211ddec11d7e011ee4a11ccc211df0610654858338ccc33bb0433dd04339"

"fcc23ce4433b88c33bb0c38cc50a877998877718877408077a28077298815ce3100eecc00ee5500e"

"f33023c1d2411ee4e117d8e11dde011e6648193bb0833db4831b84c3388c4339ccc33cb8c139c8c3"

"3bd4033ccc48b471080776600771088771588719dbc60eec600fede006f0200fe5300fe5200ff650"

"0e6e100ee3300ee5300ff3e006e9e00ee4500ef83023e2ec611cc2811dd8e117ec211de6211dc421"

"1dd8211de8211f66209d3bbc433db80339948339cc58bc7070077778077a08077a488777708719cb"

"e70eef300fe1e00ee9400fe9a00fe530c3010373a8077718875f988770708774a08774d087729881"

"844139e0c338b0433d904339cc40c4a01dcaa11de0411edec11c662463300ee1c00eec300fe9400f"

"e5000000792000001d000000721e482043880c19097232482023818c9191d144a01028643c313242"

"8e9021a318100a00060000006b65726e656c0000230802308240042308843082400c330c4230cc40"

"0c4441c84860821272b3b36b730973737ba30ba34b7b739b1b2528d271b3b36b4b9373b12b939b4b"

"7b731b2530000000a9180000250000000b0a7228877780077a587098433db8c338b04339d0c382e6"

"1cc6a10de8411ec2c11de6211de8211ddec11d1634e3600ee7500fe1200fe4400fe1200fe7500ef4"

"b08081077928877060077678877108077a28077258709cc338b4013ba4833d94c3026b1cd8211cdc"

"e11cdc201ce4611cdc201ce8811ec2611cd0a11cc8611cc2811dd861c1010ff4200fe1500ff4800e"

"00000000d11000000600000007cc3ca4833b9c033b94033da0833c94433890c30100000061200000"

"06000000130481860301000002000000075010cd14610000000000007120000003000000320e1022"

"8400fb020000000000000000650c00001f000000120394f000000000030000000600000006000000"

"4c000000010000005800000000000000580000000100000070000000000000000c00000013000000"

"1f000000080000000600000000000000700000000000000000000000010000000000000000000000"

"060000000000000006000000ffffffff00240000000000005d0c00000d0000001203946700000000"

"6b65726e656c31352e302e376e7670747836342d6e76696469612d637564613c737472696e673e00"

"00000000",

(2, 3): # (major, debug_major)

"4243c0de3514000005000000620c30244a59be669dfbb4bf0b51804c01000000210c000080010000"

"0b02210002000000160000000781239141c80449061032399201840c250508191e048b62800c4502"

"42920b42641032143808184b0a3232884870c421234412878c1041920264c808b1142043468820c9"

"01323284182a282a90317cb05c9120c3c8000000892000000b0000003222c80820624600212b2498"

"0c212524980c19270c85a4906032645c20246382a01801300128030173046000132677b00778a007"

"7cb0033a680377b0877420877408873618877a208770d8e012e5d006f0a0077640077a600774a007"

"7640076d900e71a00778a00778d006e980077a80077a80076d900e7160077a100776a0077160076d"

"900e7320077a300772a0077320076d900e7640077a600774a0077640076d900e71200778a0077120"

"0778a00771200778d006e6300772a0077320077a300772d006e6600774a0077640077a600774d006"

"f6100776a0077160077a100776d006f6300772a0077320077a300772d006f6600774a0077640077a"

"600774d006f610077280077a10077280077a10077280076de00e7160077a300772a0077640071a21"

"4c0e11de9c2e4fbbcfbe211560040000000000000000000000000620b141a0286100004016080000"

"06000000321e980c19114c908c092647c60443620914c10840190000b1180000ac0000003308801c"

"c4e11c6614013d88433884c38c4280077978077398710ce6000fed100ef4800e330c421ec2c11dce"

"a11c6630053d88433884831bcc033dc8433d8c033dcc788c7470077b08077948877070077a700376"

"788770208719cc110eec900ee1300f6e300fe3f00ef0500e3310c41dde211cd8211dc2611e663089"

"3bbc833bd04339b4033cbc833c84033bccf0147660077b6807376887726807378087709087706007"

"76280776f8057678877780875f08877118877298877998812ceef00eeee00ef5c00eec300362c8a1"

"1ce4a11ccca11ce4a11cdc611cca211cc4811dca6106d6904339c84339984339c84339b8c3389443"

"3888033b94c32fbc833cfc823bd4033bb0c30cc7698770588772708374680778608774188774a087"

"19ce530fee000ff2500ee4900ee3400fe1200eec500e3320281ddcc11ec2411ed2211cdc811edce0"

"1ce4e11dea011e66185138b0433a9c833bcc50247660077b68073760877778077898514cf4900ff0"

"500e331e6a1eca611ce8211ddec11d7e011ee4a11ccc211df0610654858338ccc33bb0433dd04339"

"fcc23ce4433b88c33bb0c38cc50a877998877718877408077a28077298815ce3100eecc00ee5500e"

"f33023c1d2411ee4e117d8e11dde011e6648193bb0833db4831b84c3388c4339ccc33cb8c139c8c3"

"3bd4033ccc48b471080776600771088771588719dbc60eec600fede006f0200fe5300fe5200ff650"

"0e6e100ee3300ee5300ff3e006e9e00ee4500ef83023e2ec611cc2811dd8e117ec211de6211dc421"

"1dd8211de8211f66209d3bbc433db80339948339cc58bc7070077778077a08077a488777708719cb"

"e70eef300fe1e00ee9400fe9a00fe530c3010373a8077718875f988770708774a08774d087729881"

"844139e0c338b0433d904339cc40c4a01dcaa11de0411edec11c662463300ee1c00eec300fe9400f"

"e5000000792000001e000000721e482043880c19097232482023818c9191d144a01028643c313242"

"8e9021a318100a00060000006b65726e656c0000230802308240042308843082400c23080431c320"

"04c30c045118858c04262821373bbb36973037b737ba30bab437b7b95102231d373bbbb6343917bb"

"32b9b9b437b7518203000000a9180000250000000b0a7228877780077a587098433db8c338b04339"

"d0c382e61cc6a10de8411ec2c11de6211de8211ddec11d1634e3600ee7500fe1200fe4400fe1200f"

"e7500ef4b08081077928877060077678877108077a28077258709cc338b4013ba4833d94c3026b1c"

"d8211cdce11cdc201ce4611cdc201ce8811ec2611cd0a11cc8611cc2811dd861c1010ff4200fe150"

"0ff4800e00000000d11000000600000007cc3ca4833b9c033b94033da0833c94433890c301000000"

"6120000006000000130481860301000002000000075010cd14610000000000007120000003000000"

"320e10228400fc020000000000000000650c00001f000000120394f0000000000300000006000000"

"060000004c000000010000005800000000000000580000000100000070000000000000000c000000"

"130000001f0000000800000006000000000000007000000000000000000000000100000000000000"

"00000000060000000000000006000000ffffffff00240000000000005d0c00000d00000012039467"

"000000006b65726e656c31352e302e376e7670747836342d6e76696469612d637564613c73747269"

"6e673e0000000000",

}

@pytest.fixture(params=("txt", "bitcode_static"))
def minimal_nvvmir(request):
    major, minor, debug_major, debug_minor = nvvm.ir_version()

    if request.param == "txt":
        return MINIMAL_NVVMIR_TXT_TEMPLATE % (major, debug_major)

    bitcode_static_binascii = MINIMAL_NVVMIR_BITCODE_STATIC.get((major, debug_major))
    if bitcode_static_binascii:
        return binascii.unhexlify(bitcode_static_binascii)
    raise RuntimeError(
        "Static bitcode for NVVM IR version "
        f"{major}.{debug_major} is not available in this test.\n"
        "Maintainers: Please run the helper script to generate it and add the "
        "output to the MINIMAL_NVVMIR_BITCODE_STATIC dict:\n"
        " ../../toolshed/build_static_bitcode_input.py"
    )

so I suggest that we move it to the common place, say a new file under cuda_python_test_helpers:
https://github.com/NVIDIA/cuda-python/tree/main/cuda_python_test_helpers/cuda_python_test_helpers
and have it imported in both cuda.bindings/core tests.

Agree on this.

This is partially addressed now; I have yet to remove the existing invocations from the cuda_bindings tests.
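One way to read the suggestion above is a small shared helper that owns the version-to-bitcode table and is imported by both test suites. The sketch below is only an assumption about what such a helper could look like: the names `static_bitcode_for` and `STATIC_BITCODE_HEX` and the shortened placeholder hex values are hypothetical and not taken from the PR. Only the lookup followed by `binascii.unhexlify` mirrors the fixture quoted above.

```python
import binascii

# Hypothetical shared table: (major, debug_major) -> hex-encoded bitcode.
# The real table would hold the long strings from test_nvvm.py; the short
# placeholder values here only keep the sketch readable.
STATIC_BITCODE_HEX = {
    (1, 3): "4243c0de",
    (2, 3): "4243c0de00",
}


def static_bitcode_for(major, debug_major, table=STATIC_BITCODE_HEX):
    """Return decoded bitcode bytes for an NVVM IR version, or raise."""
    hex_text = table.get((major, debug_major))
    if hex_text is None:
        raise RuntimeError(
            f"Static bitcode for NVVM IR version {major}.{debug_major} is not available."
        )
    return binascii.unhexlify(hex_text)
```

Each test suite could then wrap this call in its own fixture, passing the values it gets from `nvvm.ir_version()`, so the table lives in one place.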

cuda_core/cuda/core/experimental/_program.py

abhilash1910 · 2025-12-03T07:55:41Z

pre-commit.ci autofix

cuda_core/cuda/core/experimental/_program.py

brandon-b-miller · 2026-02-06T20:27:33Z

hi @abhilash1910, @rwgk, what do you think of this patch as a basis for some testing? This uses a temporary dir and creates the expected directory structure for each discovery method within it. The wheel test patches out the base site-packages dir that's expected, and the conda and CUDA_HOME methods control what's visible through either the $CONDA_PREFIX or $CUDA_HOME env vars respectively.

Details

diff --git a/cuda_pathfinder/tests/test_find_libdevice.py b/cuda_pathfinder/tests/test_find_libdevice.py
new file mode 100644
index 000000000..2d24f397f
--- /dev/null
+++ b/cuda_pathfinder/tests/test_find_libdevice.py
@@ -0,0 +1,94 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+import os
+
+import pytest
+
+from cuda.pathfinder import find_libdevice
+from cuda.pathfinder._dynamic_libs import find_libdevice as find_libdevice_module
+
+FILENAME = "libdevice.10.bc"
+
+SITE_PACKAGES_REL_DIR_CUDA12 = "nvidia/cuda_nvcc/nvvm/libdevice"
+SITE_PACKAGES_REL_DIR_CUDA13 = "nvidia/cuda_nvvm/nvvm/libdevice"
+
+
+@pytest.fixture
+def clear_find_libdevice_cache():
+    find_libdevice.cache_clear()
+    yield
+    find_libdevice.cache_clear()
+
+
+def _make_libdevice_file(dir_path: str) -> str:
+    os.makedirs(dir_path, exist_ok=True)
+    file_path = os.path.join(dir_path, FILENAME)
+    with open(file_path, "wb"):
+        pass
+    return file_path
+
+
+@pytest.mark.parametrize("rel_dir", [SITE_PACKAGES_REL_DIR_CUDA12, SITE_PACKAGES_REL_DIR_CUDA13])
+@pytest.mark.usefixtures("clear_find_libdevice_cache")
+def test_find_libdevice_via_site_packages(monkeypatch, mocker, tmp_path, rel_dir):
+    libdevice_dir = tmp_path.joinpath(*rel_dir.split("/"))
+    expected_path = str(_make_libdevice_file(str(libdevice_dir)))
+
+    mocker.patch.object(
+        find_libdevice_module,
+        "find_sub_dirs_all_sitepackages",
+        return_value=[str(libdevice_dir)],
+    )
+    monkeypatch.delenv("CONDA_PREFIX", raising=False)
+    monkeypatch.delenv("CUDA_HOME", raising=False)
+    monkeypatch.delenv("CUDA_PATH", raising=False)
+
+    result = find_libdevice()
+
+    assert result == expected_path
+    assert os.path.isfile(result)
+
+
+# same for cu12/cu13
+@pytest.mark.usefixtures("clear_find_libdevice_cache")
+def test_find_libdevice_via_conda(monkeypatch, mocker, tmp_path):
+    rel_path = os.path.join("nvvm", "libdevice")
+    libdevice_dir = tmp_path / rel_path
+    expected_path = str(_make_libdevice_file(str(libdevice_dir)))
+
+    mocker.patch.object(find_libdevice_module, "IS_WINDOWS", False)
+    mocker.patch.object(
+        find_libdevice_module,
+        "find_sub_dirs_all_sitepackages",
+        return_value=[],
+    )
+    monkeypatch.setenv("CONDA_PREFIX", str(tmp_path))
+    monkeypatch.delenv("CUDA_HOME", raising=False)
+    monkeypatch.delenv("CUDA_PATH", raising=False)
+
+    result = find_libdevice()
+
+    assert result == expected_path
+    assert os.path.isfile(result)
+
+
+@pytest.mark.usefixtures("clear_find_libdevice_cache")
+def test_find_libdevice_via_cuda_home(monkeypatch, mocker, tmp_path):
+    rel_path = os.path.join("nvvm", "libdevice")
+    libdevice_dir = tmp_path / rel_path
+    expected_path = str(_make_libdevice_file(str(libdevice_dir)))
+
+    mocker.patch.object(
+        find_libdevice_module,
+        "find_sub_dirs_all_sitepackages",
+        return_value=[],
+    )
+    monkeypatch.delenv("CONDA_PREFIX", raising=False)
+    monkeypatch.setenv("CUDA_HOME", str(tmp_path))
+    monkeypatch.delenv("CUDA_PATH", raising=False)
+
+    result = find_libdevice()
+
+    assert result == expected_path
+    assert os.path.isfile(result)

rwgk · 2026-02-10T16:07:57Z

hi @abhilash1910, @rwgk, what do you think of this patch as a basis for some testing? This uses a temporary dir and creates the expected directory structure for each discovery method within it. The wheel test patches out the base site-packages dir that's expected, and the conda and CUDA_HOME methods control what's visible through either the $CONDA_PREFIX or $CUDA_HOME env vars respectively.

This looks good to me. I convinced myself that the suggested code is pytest-xdist compatible (see below).

@abhilash1910 I believe @brandon-b-miller cannot push to this PR unless you give him push permission to this branch in your fork, but we could clone the commits here to a new branch/PR and develop the find_libdevice code & tests there. What do you think would be best?

Cursor analysis:

monkeypatch is function‑scoped and restores os.environ at teardown, so within a worker process there’s no leakage between tests.
xdist runs tests in separate worker processes, so env mutations are isolated per worker.
The only real shared state is the @functools.cache on find_libdevice, and the fixture clear_find_libdevice_cache already clears it before/after each test, which prevents order dependence.

So yes, those env edits are xdist‑compatible as written.
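The point about the cache being the only shared state can be illustrated without pytest. The sketch below is a stand-in under stated assumptions: `find_thing` and the `SKETCH_CUDA_HOME` variable are hypothetical and are not the real `find_libdevice` or its environment variables. It only shows why a `functools.cache`-decorated lookup needs `cache_clear()` around environment changes.

```python
import functools
import os


@functools.cache
def find_thing():
    """Hypothetical stand-in for a cached, environment-driven lookup."""
    return os.environ.get("SKETCH_CUDA_HOME")


def lookup_with_env(value, clear_cache):
    """Set the env var, optionally clear the cache, then return the lookup."""
    os.environ["SKETCH_CUDA_HOME"] = value
    if clear_cache:
        find_thing.cache_clear()
    return find_thing()


find_thing.cache_clear()
first = lookup_with_env("/tmp/a", clear_cache=True)
# Without clearing, the cached result from the previous call is returned.
stale = lookup_with_env("/tmp/b", clear_cache=False)
fresh = lookup_with_env("/tmp/b", clear_cache=True)
os.environ.pop("SKETCH_CUDA_HOME", None)
```

Here `stale` still reflects the first environment, which is the order dependence that the `clear_find_libdevice_cache` fixture in the patch is meant to prevent.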

abhilash1910 · 2026-02-10T17:02:04Z

Yes @brandon-b-miller let me know if you received the invite. The change looks good to me as well.

brandon-b-miller · 2026-02-11T15:52:13Z

Thanks @abhilash1910. I hope you don't mind that I've pushed a few commits here, which should address some of the remaining review comments. Thanks again for this implementation!

@rwgk could you take another look when you get a chance?

abhilash1910 · 2026-02-11T16:35:01Z

Thanks @brandon-b-miller for the help :)

abhilash1910 · 2026-02-12T04:46:20Z

pre-commit.ci autofix

abhilash1910 · 2026-02-12T05:02:32Z

@rwgk @leofang @brandon-b-miller @kkraus14 requesting review. Thanks

abhilash1910 · 2026-02-12T11:45:48Z

pre-commit.ci autofix

leofang

No issues with the cuda.core part. The pathfinder part needs more thought, though.

leofang · 2026-02-12T14:47:59Z

cuda_core/tests/test_program.py

+@nvvm_available
+@pytest.mark.parametrize(
+    "options",
+    [
+        ProgramOptions(name="ltoir_test1", arch="sm_90", device_code_optimize=False),
+        ProgramOptions(name="ltoir_test2", arch="sm_100", link_time_optimization=True),
+        ProgramOptions(
+            name="ltoir_test3",
+            arch="sm_90",
+            ftz=True,
+            prec_sqrt=False,
+            prec_div=False,
+            fma=True,
+            device_code_optimize=True,
+            link_time_optimization=True,
+        ),
+    ],
+)
+def test_nvvm_program_options_ltoir(init_cuda, nvvm_ir, options):
+    """Test NVVM programs for LTOIR with different options"""
+    program = Program(nvvm_ir, "nvvm", options)
+    assert program.backend == "NVVM"
+
+    ltoir_code = program.compile("ltoir")
+    assert isinstance(ltoir_code, ObjectCode)
+    assert ltoir_code.name == options.name
+    code_content = ltoir_code.code
+    assert len(code_content) > 0
+    program.close()


Q: Can this test be combined with the one above (test_nvvm_program_options) and parametrized over target=("ptx", "ltoir")?
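As a rough illustration of the question above: stacking a second `@pytest.mark.parametrize("target", ("ptx", "ltoir"))` on a test that is already parametrized over `options` makes pytest run it once per (options, target) pair. The sketch below enumerates that matrix in plain Python so it runs without a GPU, NVVM, or pytest. `fake_compile` is a hypothetical stand-in for `Program.compile(target)` and its return shape is an assumption; only the option names come from the diff above.

```python
import itertools

OPTION_NAMES = ("ltoir_test1", "ltoir_test2", "ltoir_test3")  # names from the diff
TARGETS = ("ptx", "ltoir")


def fake_compile(name, target):
    """Hypothetical stand-in for Program.compile(target); returns a plain dict."""
    return {"name": name, "code_type": target, "code": b"\x00"}


def check_compiled(name, target):
    """Shared assertions that a combined test body could make for any target."""
    result = fake_compile(name, target)
    assert result["name"] == name
    assert result["code_type"] == target
    assert len(result["code"]) > 0
    return result


# One run per (options, target) pair, as stacked parametrize decorators would do.
CASES = list(itertools.product(OPTION_NAMES, TARGETS))
RESULTS = [check_compiled(name, target) for name, target in CASES]
```

Whether the real tests can share one body depends on whether every option set is valid for both targets, which the thread does not settle.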

leofang · 2026-02-12T14:48:58Z

cuda_core/cuda/core/_program.pyx

+        # Add extra modules if provided
+        if options.extra_sources is not None:
+            if not is_sequence(options.extra_sources):
+                raise TypeError(
+                    "extra_sources must be a sequence of 2-tuples: ((name1, source1), (name2, source2), ...)"
+                )
+            for i, module in enumerate(options.extra_sources):
+                if not isinstance(module, tuple) or len(module) != 2:
+                    raise TypeError(
+                        f"Each extra module must be a 2-tuple (name, source)"
+                        f", got {type(module).__name__} at index {i}"
+                    )
+
+                module_name, module_source = module
+
+                if not isinstance(module_name, str):
+                    raise TypeError(f"Module name at index {i} must be a string, got {type(module_name).__name__}")
+
+                if isinstance(module_source, str):
+                    # Textual LLVM IR - encode to UTF-8 bytes
+                    module_source = module_source.encode("utf-8")
+                elif not isinstance(module_source, (bytes, bytearray)):
+                    raise TypeError(
+                        f"Module source at index {i} must be str (textual LLVM IR), bytes (textual LLVM IR or bitcode), "
+                        f"or bytearray, got {type(module_source).__name__}"
+                    )
+
+                if len(module_source) == 0:
+                    raise ValueError(f"Module source for '{module_name}' (index {i}) cannot be empty")


Not a blocker; we can handle it in the next PR. The option validation should be moved under ProgramOptions.
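A minimal sketch of what moving the validation under the options object could look like, assuming the options type is a dataclass that can validate in `__post_init__`. `OptionsSketch` and `_normalize_extra_sources` are hypothetical names, not the real `ProgramOptions`. The checks follow the diff above, with one addition that is an assumption: a bare `str` or `bytes` is rejected up front.

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptionsSketch:
    """Hypothetical stand-in for ProgramOptions; only extra_sources is modeled."""

    extra_sources: Optional[Sequence] = None

    def __post_init__(self):
        # Validate once at construction so the compile path can trust the value.
        if self.extra_sources is not None:
            self.extra_sources = _normalize_extra_sources(self.extra_sources)


def _normalize_extra_sources(extra_sources):
    """Check each (name, source) pair and return a tuple of (str, bytes) pairs."""
    if isinstance(extra_sources, (str, bytes)) or not isinstance(extra_sources, Sequence):
        raise TypeError("extra_sources must be a sequence of 2-tuples (name, source)")
    normalized = []
    for i, module in enumerate(extra_sources):
        if not isinstance(module, tuple) or len(module) != 2:
            raise TypeError(f"Entry {i} must be a 2-tuple (name, source), got {type(module).__name__}")
        name, source = module
        if not isinstance(name, str):
            raise TypeError(f"Module name at index {i} must be a string, got {type(name).__name__}")
        if isinstance(source, str):
            source = source.encode("utf-8")  # textual LLVM IR
        elif not isinstance(source, (bytes, bytearray)):
            raise TypeError(f"Module source at index {i} must be str, bytes, or bytearray")
        if len(source) == 0:
            raise ValueError(f"Module source for '{name}' (index {i}) cannot be empty")
        normalized.append((name, bytes(source)))
    return tuple(normalized)
```

With this shape, an invalid `extra_sources` fails when the options are built, before any program is created.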

cuda_pathfinder/cuda/pathfinder/_static_libs/find_libdevice.py

leofang · 2026-02-12T14:55:54Z

cuda_pathfinder/cuda/pathfinder/__init__.py

+from cuda.pathfinder._static_libs.find_libdevice import (
+    find_libdevice as find_libdevice,
+)
+from cuda.pathfinder._static_libs.find_libdevice import (
+    get_libdevice_path as get_libdevice_path,
+)


I feel libdevice is not special enough to have its own functions. How about we consolidate these with find_nvidia_binary_utility? @rwgk WDYT?

Hm, .bc (bitcode library) seems conceptually very different than e.g. nvcc (executable).

I think cuda/pathfinder/_static_libs is a better fit, compared to cuda/pathfinder/_binaries.

For the API, to mirror what we have for headers, how about:

locate_bitcode_lib("device") to get something similar to LocatedHeaderDir

find_bitcode_lib("device") to get just the abs_path

For other static libs there could be locate_static_lib("cudart").

So we'd be lumping the locate_bitcode_lib and locate_static_lib implementations under _static_libs, but that'd be a hidden implementation detail.
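The proposed pair of functions could be shaped roughly as follows. This is a sketch under assumptions, not the pathfinder API: the `LocatedBitcodeLib` fields, the `search_dirs` parameter, and the name-to-filename table are all hypothetical. Only the function names, the `"device"` argument, and the `libdevice.10.bc` filename come from the thread.

```python
import os
import tempfile
from dataclasses import dataclass

# Hypothetical mapping from short library name to bitcode file name.
BITCODE_LIB_FILENAMES = {"device": "libdevice.10.bc"}


@dataclass(frozen=True)
class LocatedBitcodeLib:
    """Hypothetical result type; field names are assumptions."""

    name: str
    abs_path: str
    found_via: str


def locate_bitcode_lib(name, search_dirs):
    """Return details for the first match in (label, directory) pairs, else None."""
    filename = BITCODE_LIB_FILENAMES[name]
    for label, directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return LocatedBitcodeLib(name, os.path.abspath(candidate), label)
    return None


def find_bitcode_lib(name, search_dirs):
    """Return only the absolute path, or None if nothing was found."""
    located = locate_bitcode_lib(name, search_dirs)
    return None if located is None else located.abs_path


# Usage with a throwaway directory standing in for a toolkit layout.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "libdevice.10.bc"), "wb").close()
    dirs = [("missing", os.path.join(tmp, "absent")), ("tmp", tmp)]
    located = locate_bitcode_lib("device", dirs)
    found_path = find_bitcode_lib("device", dirs)
    not_found = find_bitcode_lib("device", dirs[:1])
```

Having the path-only function delegate to the richer one keeps a single search implementation behind both entry points.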

leofang · 2026-02-12T14:56:12Z

cuda_pathfinder/cuda/pathfinder/_static_libs/find_libdevice.py

+FILENAME = "libdevice.10.bc"
+if IS_WINDOWS:
+    bases = [r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA", r"C:\CUDA"]
+else:
+    bases = ["/usr/local/cuda", "/opt/cuda"]


This is nerve-racking...

Yes, I am looking for a better solution; this looks ugly.
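One possible direction, sketched here purely as an assumption and not as what the PR ended up doing, is to derive candidate directories from the environment instead of a hard-coded list of install roots. The function name `candidate_libdevice_dirs` is hypothetical. The `CUDA_HOME` and `CUDA_PATH` variables and the `nvvm/libdevice` sub-path are taken from the test patch earlier in the thread.

```python
import os


def candidate_libdevice_dirs(environ=None):
    """Build nvvm/libdevice candidates from CUDA_HOME and CUDA_PATH only."""
    env = os.environ if environ is None else environ
    candidates = []
    for var in ("CUDA_HOME", "CUDA_PATH"):
        root = env.get(var)
        if not root:
            continue
        path = os.path.join(root, "nvvm", "libdevice")
        if path not in candidates:  # both variables may point at the same root
            candidates.append(path)
    return candidates
```

Passing the environment in as a mapping also keeps the function easy to test without touching `os.environ`.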

leofang · 2026-02-12T14:56:51Z

cuda_pathfinder/cuda/pathfinder/_static_libs/find_libdevice.py

+    return abs_path
+
+
+def get_libdevice_path() -> str | None:


I agree, having both get_libdevice_path and find_libdevice is confusing.

abhilash1910 · 2026-02-12T19:13:38Z

pre-commit.ci autofix

add ltoir test support

a4db20b

abhilash1910 marked this pull request as draft November 5, 2025 02:17

leofang assigned abhilash1910 Nov 10, 2025

leofang added this to the cuda.core beta 9 milestone Nov 10, 2025

leofang added the enhancement, P1, and cuda.core labels Nov 10, 2025

abhilash1910 added 3 commits November 25, 2025 04:10

add options for multi-modules

fb6cfb3

add tests

7aaed4e

add bitcode test

64c7f7d

pre-commit-ci bot and others added 2 commits November 25, 2025 14:51

[pre-commit.ci] auto code formatting

42ba301

fix format

7ca6899

[pre-commit.ci] auto code formatting

0674ea1

leofang linked an issue Nov 25, 2025 that may be closed by this pull request

NVVM support - follow-up #981

Open

leofang reviewed Nov 25, 2025

abhilash1910 added 6 commits November 26, 2025 10:26

refresh

03b1224

apply bitcode file from cupy_test helpers

033f11c

use 2 tuples

6e411ee

Merge branch 'main' into nvvm_enhance

b4c21db

refresh

aeb26aa

format

b3d6d96

[pre-commit.ci] auto code formatting

edd6401

leofang reviewed Dec 5, 2025

cuda_core/cuda/core/experimental/_program.py (outdated)

leofang reviewed Dec 5, 2025

cuda_core/cuda/core/experimental/_program.py (outdated)

Merge branch 'main' into nvvm_enhance

d53e00b

Merge branch 'main' into nvvm_enhance

f89aac8

abhilash1910 and others added 4 commits February 11, 2026 00:30

Merge branch 'main' into nvvm_enhance

0ad13ae

tests

dcdd100

Address reviews

aca2e36

put libdevice stuff under _static_libs

af6e70a

refresh reviews

b1d423f

abhilash1910 added 2 commits February 11, 2026 17:37

change program to cython per PR 1565

4cedbb7

Merge branch 'main' into nvvm_enhance

d9aed9b

abhilash1910 marked this pull request as ready for review February 11, 2026 17:42

abhilash1910 added 4 commits February 11, 2026 19:54

fix import

ca32d2b

fix tests

fac1907

fix ruff check

4a01e06

ruff fix find_libdevice

2d5252f

pre-commit-ci bot and others added 2 commits February 12, 2026 04:46

[pre-commit.ci] auto code formatting

2976c24

add spdx and copyright

61c1e00

rm redundant include and fix test

c6bea0c

[pre-commit.ci] auto code formatting

b7866cf

leofang reviewed Feb 12, 2026

abhilash1910 added 2 commits February 12, 2026 18:58

refresh tests

4283230

add correct libdevice for CTK> 13

78f4328

[pre-commit.ci] auto code formatting

ddf4839
