lix19937/tensorrt-insight

Deep insight into TensorRT, including but not limited to QAT, PTQ, plugins, Triton Inference Server, and CUDA.

TensorRT is NVIDIA's semi-open-source, high-performance AI inference engine framework/library, spanning NVIDIA GPU architectures. It provides C++/Python interfaces and a user-defined plugin mechanism, and it covers the main aspects of AI inference engine technology.
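
As a quick orientation, here is a minimal sketch of the Python interface: parsing an ONNX model and building a serialized engine. It assumes TensorRT 8.x with an explicit-batch network; the file names `model.onnx` and `model.plan` are placeholders.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# Explicit-batch network definition (required by the ONNX parser).
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:          # placeholder model path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)        # allow FP16 kernels where profitable

engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:          # serialized engine for the runtime
    f.write(engine_bytes)
```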

| topic | description | notes |
| --- | --- | --- |
| overview | Overview | |
| layout | Memory layout | |
| compute_graph_optimize | Compute-graph optimization | |
| dynamic_shape | Dynamic shapes | see the sketch after this table |
| plugin | Plugins | |
| calibration | Calibration (INT8 PTQ) | see the sketch after this table |
| asp | Sparsity (ASP) | |
| qat | Quantization-aware training | |
| trtexec | OSS auxiliary tool | |
| tool | Helper scripts | |
| runtime | Runtime | |
| inferflow | Model scheduling | |
| mps | MPS (CUDA Multi-Process Service) | |
| deploy | ONNX-based deployment workflow; using the TensorRT tools | |
| py-tensorrt | Python TensorRT bindings; walkthrough of `tensorrt.__init__` | |
| model_benchmark | Model performance benchmarking | |
| cookbook | Cookbook | |
| incubator | Incubator | |
| developer_guide | Developer guide | |
| triton-inference-server | Triton Inference Server | |
| cuda | CUDA programming | |
| onnxruntime op | ONNX Runtime custom ops; graph-optimization aids; per-layer output alignment | |
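
Two of the topics above lend themselves to short illustrations. For dynamic_shape, a minimal sketch continuing the `builder`/`config` objects from the example above: an optimization profile tells the builder which shape range each dynamic input may take. The tensor name "input" and the shape ranges are placeholders.

```python
# Dynamic shapes: declare min/opt/max shapes for each dynamic input.
profile = builder.create_optimization_profile()
profile.set_shape("input",             # placeholder input tensor name
                  (1, 3, 224, 224),    # min shape
                  (8, 3, 224, 224),    # opt shape (kernels tuned for this)
                  (16, 3, 224, 224))   # max shape
config.add_optimization_profile(profile)
```

For calibration (INT8 PTQ), the builder pulls batches through a user-supplied calibrator. A minimal sketch, assuming pycuda for device buffers and a list of preprocessed `np.float32` batches; `calib.cache` is a placeholder cache path.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a default CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to the builder during INT8 calibration."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()                     # required by the TensorRT bindings
        self.batch_size = batches[0].shape[0]
        self.device_input = cuda.mem_alloc(batches[0].nbytes)
        self.batches = iter(batches)           # iterable of np.float32 arrays
        self.cache_file = cache_file           # placeholder cache path

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            batch = next(self.batches)
        except StopIteration:
            return None                        # no more data: calibration finishes
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
        return [int(self.device_input)]        # one device pointer per input

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()                # reuse a previous calibration run
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

# Hook the calibrator into the builder config from the first sketch:
# config.set_flag(trt.BuilderFlag.INT8)
# config.int8_calibrator = EntropyCalibrator(batches)
```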

References

https://docs.nvidia.com/deeplearning/tensorrt/archives/
https://developer.nvidia.com/search?page=1&sort=relevance&term=
https://github.com/HeKun-NVIDIA/TensorRT-Developer_Guide_in_Chinese/tree/main
https://docs.nvidia.com/deeplearning/tensorrt/migration-guide/index.html
https://developer.nvidia.com/zh-cn/blog/nvidia-gpu-fp8-training-inference/
