Distributed SQL Query Engine in Python using Ray

Rust 220 14 Updated Nov 20, 2023

RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.

Python 294 66 Updated Jun 30, 2024

This is the official repository for M2UGen

Jupyter Notebook 426 39 Updated May 8, 2024

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

2,843 657 Updated Jul 25, 2024

Curated list of project-based tutorials

185,323 24,281 Updated Jul 22, 2024

Tools to download and cleanup Common Crawl data

Python 941 139 Updated Apr 25, 2023

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,463 339 Updated Mar 20, 2024

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 32,224 5,488 Updated Jul 25, 2024

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Python 42,799 17,641 Updated Jul 25, 2024

Apache Spark - A unified analytics engine for large-scale data processing

Scala 38,996 28,117 Updated Jul 25, 2024

ClickHouse® is a real-time analytics DBMS

C++ 35,657 6,659 Updated Jul 25, 2024

A Go HTTP framework with high performance and strong extensibility for building microservices.

Go 4,912 478 Updated Jul 25, 2024

The Go programming language

Go 121,524 17,405 Updated Jul 25, 2024

Apache Flink

Java 23,566 13,131 Updated Jul 25, 2024

beego is an open-source, high-performance web framework for the Go programming language.

Go 713 180 Updated Apr 27, 2022

A Go ebook introducing how to build a web application with Go

Go 43,111 10,650 Updated May 12, 2024

Source code for the "Grinding Through LeetCode" article series from the WeChat official account 「宫水三叶的刷题日记」

7,228 953 Updated Jul 20, 2024

Leetcode algorithm solutions together with self-made teaching videos

Java 72 22 Updated Aug 10, 2019

Jupyter Notebook 870 328 Updated Jan 21, 2020

A collection of web crawlers

22,027 4,792 Updated Sep 27, 2023

Mining synonyms from unstructured and semi-structured data

Python 236 61 Updated May 7, 2024

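A programmer who loves cooking, determined to explain algorithms in plain, accessible terms through animation. My interview-prep website: www.chengxuchu.com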

10,609 1,568 Updated May 11, 2023

Provide all my solutions and explanations in Chinese for all the Leetcode coding problems.

6,143 734 Updated Dec 9, 2023

📚 Essential fundamentals for technical interviews: Leetcode, operating systems, computer networks, system design

173,896 50,788 Updated Jul 5, 2024

Jupyter Notebook 68 51 Updated Jun 29, 2022

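This project covers the knowledge points and code implementations most often tested in Machine Learning, Deep Learning, and NLP interviews, which are also the theoretical fundamentals every algorithm engineer must know.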

Jupyter Notebook 15,523 4,495 Updated Jun 21, 2022

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Python 3,252 664 Updated Jul 13, 2024

A joint model for entity recognition and relation extraction

Python 117 26 Updated Jan 24, 2019