👋 Hi, I'm 褚成志
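I'm a technical services expert in cloud infrastructure, cloud-native, and AI computing. This blog records my thinking and hands-on practice along my path of technical growth. Feel free to get in touch.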
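The blog covers the following areas: cloud-native and containers (Kubernetes, Docker, Istio, Service Mesh) · operations and observability (Linux, monitoring, Ansible, performance analysis, DevOps) · backend development (Java, Spring, microservices, distributed systems, message queues) · big data and storage (Elasticsearch, Hadoop, Redis, MongoDB) · system fundamentals (operating systems, computer networking, JVM, concurrency)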
🌐 About Me
Cloud Infrastructure / Cloud-Native / AI Computing Technical Expert based in China.
I specialize in large-scale private cloud delivery and operations at Huawei, with a focus on AI computing infrastructure. My work spans GPU / Ascend heterogeneous cluster deployment and tuning, distributed inference environments for large models (including DeepSeek in government scenarios), and high-performance networking optimization (InfiniBand / RoCEv2 / HCCL / RDMA).
Beyond AI workloads, I handle stability governance for clusters of 8,000+ physical nodes: building observability stacks, defining SLO/SLA standards, and driving large-scale automation with Python and Ansible.
📬 Contact
- 🔗 GitHub: initchu
