ML.NET: how to handle data classification for extra-long texts? #7456

1. I have a large number of recorded conversations between staff and customers. Running them through ASR produces long text data in the format shown below (original conversation example).
2. Each line of the sample data begins with the start and end time of the sentence, followed by the speaker role, then a colon and the sentence content.
3. Over a full day of reception work, a staff member talks with multiple groups of customers. Each group may be one person or several people.
4. I need to train a "session boundary detection" model that takes a segment of several dialogue sentences as input.
5. The model should predict whether the current input segment contains a boundary point (the start or end of a session) and return the start time of the corresponding sentence together with a boundary label of 1 or 0. The goal is to split the day's transcript into the separate dialogues between the staff member and each group of customers.
6. The following is an example of the data.
11:03:42-11:03:42 ：Hello, do you need help?
........
........
........
12:03:42-12:03:42 ：Please walk slowly and welcome to the next visit.
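
To make the input and output shape described above concrete, here is a minimal Python sketch (not ML.NET code). It assumes each line looks like `HH:MM:SS-HH:MM:SS <role>：<text>`, that the role may be empty as in the sample, and that the colon may be full-width or ASCII. `parse_line`, `windows`, and the window size are hypothetical names and values chosen for illustration. Each window pairs the start time of one sentence with a short piece of surrounding text, which is the kind of unit a 1/0 boundary label could be attached to instead of the whole day's transcript.

```python
import re
from typing import Iterator, List, NamedTuple, Optional, Tuple

# Assumed line shape: "HH:MM:SS-HH:MM:SS <role>：<text>".
# The role may be empty; the colon may be full-width (：) or ASCII (:).
LINE_RE = re.compile(
    r"^(?P<start>\d{2}:\d{2}:\d{2})-(?P<end>\d{2}:\d{2}:\d{2})\s*"
    r"(?P<role>[^：:]*?)\s*[：:]\s*(?P<text>.*)$"
)


class Utterance(NamedTuple):
    start: str
    end: str
    role: str
    text: str


def parse_line(line: str) -> Optional[Utterance]:
    """Parse one transcript line; return None if it does not match the assumed shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Utterance(m["start"], m["end"], m["role"], m["text"])


def windows(utterances: List[Utterance], size: int = 5) -> Iterator[Tuple[str, str]]:
    """Yield (start time of sentence i, text of sentence i plus its neighbours).

    `size` is the approximate number of sentences per window (hypothetical default).
    """
    half = size // 2
    for i, utt in enumerate(utterances):
        context = utterances[max(0, i - half): i + half + 1]
        yield utt.start, " ".join(u.text for u in context)
```

Each `(start_time, window_text)` pair could then be labeled 1 if the sentence at that start time begins or ends a session and 0 otherwise. Whether this windowing suits a particular ML.NET trainer is an open question rather than something this sketch settles.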

