[NVVM] Support - Followup enhancements#1218
[NVVM] Support - Followup enhancements#1218abhilash1910 wants to merge 48 commits intoNVIDIA:mainfrom
Conversation
|
Thanks, @abhilash1910! Any ETA to wrap this up? |
|
pre-commit.ci autofix |
|
pre-commit.ci autofix |
leofang
left a comment
There was a problem hiding this comment.
Thanks, Abhilash! Leaving a few early feedbacks.
cuda_core/tests/test_program.py
Outdated
| bitcode_path = os.environ.get("BITCODE_NVVM_PATH") | ||
| if not bitcode_path: | ||
| pytest.skip("BITCODE_NVVM_PATH environment variable is not set.Disabling the test.") | ||
| bitcode_file = Path(bitcode_path) | ||
| if not bitcode_file.exists(): | ||
| pytest.skip(f"Bitcode file not found: {bitcode_path}") | ||
|
|
||
| if bitcode_file.suffix != ".bc": | ||
| pytest.skip(f"Expected .bc file, got: {bitcode_file.suffix}") |
There was a problem hiding this comment.
Would it be possible for us to avoid having a file locally? We have bitcode in this repo already:
cuda-python/cuda_bindings/tests/test_nvvm.py
Lines 12 to 141 in b9c76b3
so I suggest that we move it to the common place, say a new file under
cuda_python_test_helpers:https://github.com/NVIDIA/cuda-python/tree/main/cuda_python_test_helpers/cuda_python_test_helpers
and have it imported in both cuda.bindings/core tests.
There was a problem hiding this comment.
Agree on this.
There was a problem hiding this comment.
This should be partially addressed now as I have yet to remove the existing invocations from cuda_bindings test
|
pre-commit.ci autofix |
|
hi @abhilash1910, @rwgk, what do you think of this patch as a basis for some testing? This uses a temporary dir and creates the expected directory structure for each discovery method within it. The wheel test patches out the base site-packages dir that's expected, and the conda and CUDA_HOME methods control what's visible through either the Details |
This looks good to me. I convinced myself that the suggested code is @abhilash1910 I believe @brandon-b-miller cannot push to this PR unless you give him push permission to this branch in your fork, but we could clone the commits here to a new branch/PR and develop the Cursor analysis:
So yes, those env edits are |
|
Yes @brandon-b-miller let me know if you received the invite. The change looks good to me as well. |
|
Thanks @abhilash1910 I hope you don't mind I've pushed a few commits here hopefully addressing some of the remaining reviews. Thanks again for this implementation! @rwgk could you take another look when you get a chance? |
|
Thanks @brandon-b-miller for the help :) |
|
pre-commit.ci autofix |
|
@rwgk @leofang @brandon-b-miller @kkraus14 requesting review. Thanks |
|
pre-commit.ci autofix |
leofang
left a comment
There was a problem hiding this comment.
No issues with the cuda.core part. We'd need more thoughts on the pathfinder, though.
cuda_core/tests/test_program.py
Outdated
| @nvvm_available | ||
| @pytest.mark.parametrize( | ||
| "options", | ||
| [ | ||
| ProgramOptions(name="ltoir_test1", arch="sm_90", device_code_optimize=False), | ||
| ProgramOptions(name="ltoir_test2", arch="sm_100", link_time_optimization=True), | ||
| ProgramOptions( | ||
| name="ltoir_test3", | ||
| arch="sm_90", | ||
| ftz=True, | ||
| prec_sqrt=False, | ||
| prec_div=False, | ||
| fma=True, | ||
| device_code_optimize=True, | ||
| link_time_optimization=True, | ||
| ), | ||
| ], | ||
| ) | ||
| def test_nvvm_program_options_ltoir(init_cuda, nvvm_ir, options): | ||
| """Test NVVM programs for LTOIR with different options""" | ||
| program = Program(nvvm_ir, "nvvm", options) | ||
| assert program.backend == "NVVM" | ||
|
|
||
| ltoir_code = program.compile("ltoir") | ||
| assert isinstance(ltoir_code, ObjectCode) | ||
| assert ltoir_code.name == options.name | ||
| code_content = ltoir_code.code | ||
| assert len(code_content) > 0 | ||
| program.close() |
There was a problem hiding this comment.
Q: Can this test be combined with the one above (test_nvvm_program_options) and parametrized over target=("ptx", "ltoir")?
| # Add extra modules if provided | ||
| if options.extra_sources is not None: | ||
| if not is_sequence(options.extra_sources): | ||
| raise TypeError( | ||
| "extra_sources must be a sequence of 2-tuples: ((name1, source1), (name2, source2), ...)" | ||
| ) | ||
| for i, module in enumerate(options.extra_sources): | ||
| if not isinstance(module, tuple) or len(module) != 2: | ||
| raise TypeError( | ||
| f"Each extra module must be a 2-tuple (name, source)" | ||
| f", got {type(module).__name__} at index {i}" | ||
| ) | ||
|
|
||
| module_name, module_source = module | ||
|
|
||
| if not isinstance(module_name, str): | ||
| raise TypeError(f"Module name at index {i} must be a string, got {type(module_name).__name__}") | ||
|
|
||
| if isinstance(module_source, str): | ||
| # Textual LLVM IR - encode to UTF-8 bytes | ||
| module_source = module_source.encode("utf-8") | ||
| elif not isinstance(module_source, (bytes, bytearray)): | ||
| raise TypeError( | ||
| f"Module source at index {i} must be str (textual LLVM IR), bytes (textual LLVM IR or bitcode), " | ||
| f"or bytearray, got {type(module_source).__name__}" | ||
| ) | ||
|
|
||
| if len(module_source) == 0: | ||
| raise ValueError(f"Module source for '{module_name}' (index {i}) cannot be empty") |
There was a problem hiding this comment.
Not a blocker, we can handle it in the next PR. The option validation should be moved to under ProgramOptions.
| from cuda.pathfinder._static_libs.find_libdevice import ( | ||
| find_libdevice as find_libdevice, | ||
| ) | ||
| from cuda.pathfinder._static_libs.find_libdevice import ( | ||
| get_libdevice_path as get_libdevice_path, | ||
| ) |
There was a problem hiding this comment.
I feel libdevice is not special enough to have its own functions. How about we consolidate these with find_nvidia_binary_utility? @rwgk WDYT?
There was a problem hiding this comment.
Hm, .bc (bitcode library) seems conceptually very different than e.g. nvcc (executable).
I think cuda/pathfinder/_static_libs is a better fit, compared to cuda/pathfinder/_binaries.
For the API, to mirror what we have for headers, how about:
locate_bitcode_lib("device")to get something similar toLocatedHeaderDirfind_bitcode_lib("device")to get just theabs_path
For other static libs there could be locate_static_lib("cudart").
So we'd be lumping the locate_bitcode_lib and locate_static_lib implementations under _static_lib, but that'd be a hidden implementation detail.
| FILENAME = "libdevice.10.bc" | ||
| if IS_WINDOWS: | ||
| bases = [r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA", r"C:\CUDA"] | ||
| else: | ||
| bases = ["/usr/local/cuda", "/opt/cuda"] |
There was a problem hiding this comment.
Yes I am looking for a better solution , this looks ugly.
| return abs_path | ||
|
|
||
|
|
||
| def get_libdevice_path() -> str | None: |
There was a problem hiding this comment.
I agree, having both get_libdevice_path and find_libdevice is confusing
|
pre-commit.ci autofix |
Description
Issue Link - #981
Changes to be addressed in this WIP PR:
{If / when it is possible to add multiple modules, a test with code that uses something from libdevice is probably a good idea.
It's also useful to be able to lazily add a module}
cc @leofang