Newest 'text+nlp' Questions

0 votes

1 answer

22 views

Accuracy_score with same value in different classifiers methods

I'm doing a project on Google Colab for fake news classification with the LIAR dataset. I am running three different feature extractors (TF-IDF, DistilBERT and LLAMA 2) and seven classifiers (...

lucasa.lisboa

23

asked Jun 20 at 17:42

0 votes

1 answer

45 views

Benepar for syntactic segmentation

I want to use Benepar with a French model to do syntactic segmentation. I followed the tutorial, but I always get this error: RuntimeError: Error(s) in loading state_dict for ChartParser: ...

nassima.crt

1

asked Jun 6 at 8:57

1 vote

0 answers

48 views

Chunking text with bounding box values

I have used the Azure OCR service to extract text from PDFs. For each page in a PDF, the OCR output contains a list of text lines along with the bounding box values for that line. My original approach ...

AnonymousMe

559

asked May 29 at 7:54

0 votes

1 answer

34 views

Converting a text file into a data frame in R

In R, I have a .txt file that I would like to extract data from as a character string. My .txt file is formatted like the following, with a list separated by numbers: 1. [text1] 2. [text2] 3. [text3] and ...

sebastian.mendoza

5

asked May 13 at 18:24

0 votes

0 answers

60 views

Trying to cluster short survey answers (1 to 10 words). Am I on the right track?

Here's an explanation of what I want to build (it's a project for school). 1. A user puts in a file with just the answers to whatever question was asked in the survey. 2. The machine finds ...

Shimz

1

asked Apr 30 at 7:50

0 votes

1 answer

535 views

LangChain SQL agent with context

I am working on a LangChain-based SQL chat application and want my agent to understand context with respect to the user session. For example: User - What is highest order placed in last placed? Bot - Order id : ...

matvi

13

asked Apr 27 at 7:49

1 vote

1 answer

137 views

ValueError: Cannot use a compiled regex as replacement pattern with regex=False

I'm doing a project on Google Colab, where I use the following versions: !pip install "gensim==4.2.0" !pip install "texthero==1.0.5" Until recently, I received the following ...

lucasa.lisboa

23

asked Apr 24 at 1:52

0 votes

0 answers

24 views

SVM algorithm training/fitting doesn't work for text classification

I'm trying to fit the sentiment5 data, which contains 2 variables: "tweet", which holds vectorized text data (using TF-IDF), and "target", which holds 1 and 0 for positive and negative. I ...

Arcane Persona

19

asked Mar 4 at 11:23

0 votes

0 answers

40 views

LaTeX/mathematical text cleaning / mwparser

I am looking to build a data science focused search engine and had a question for those familiar with parsing text with mathematical notation. So I have set up a standard WikiAPI class with a method ...

goofy-data-scientist

41

asked Dec 15, 2023 at 22:59

0 votes

1 answer

79 views

How to decide correct NLP approach for a project

I'm working on an NLP project. My task is to determine the category and Sentiment Score of Turkish Call Center conversations from the conversations themselves. We are using Python as our programming ...

Bilal Sedef

103

asked Nov 2, 2023 at 13:45

1 vote

0 answers

52 views

I have a Spark DataFrame and I want to generate n-grams the way the gensim bigram model does

I have a text DataFrame (tweets). I am using Spark for high-volume data handling, and I want to generate bigrams the same way Gensim's bigram models do. I have been using Spark NLP for ...

Criscas05

11

asked Oct 15, 2023 at 22:27

0 votes

0 answers

114 views

How to segment text in PDF files to get out some headings

Let's say that I have a couple hundred PDF files from which I have to extract each heading and the relevant text for further processing. For each heading, how do I do that while keeping the format of the ...

USMAN SIDDIQUI

1

asked Oct 14, 2023 at 20:53

0 votes

0 answers

33 views

Faster approach to collect text data from multiple URLs and save it to the DataFrame row-wise for each URL

I have a DataFrame of shape (700000, 5). One column of the DataFrame holds a single unique text file URL per row. Example (two columns shown for reference): Task identifier Text url ub12345567 https://someadd....

Remrem

25

asked Sep 21, 2023 at 7:56

0 votes

0 answers

37 views

How can I identify the number of occurrences of multiple custom emotions, grouped by line, team, and personal ID?

I have a data frame like the following (but much larger and with repeated observations across time): df <- data.frame( participant_ID = 1:4, TeamID = c("A", "A", "B", ...

user22571454

1

asked Sep 16, 2023 at 3:53
