gargantext-0.0.4.9.5: Search, map, share
Copyright: (c) CNRS 2017-Present
License: AGPL + CECILL v3
Maintainer: team@gargantext.org
Stability: experimental
Portability: POSIX
Safe Haskell: None
Language: Haskell2010

Gargantext.Core.Text.Metrics.Count

Description

Token and occurrence

An occurrence is not necessarily a token. Consider the sentence "A rose is a rose is a rose". We may equally correctly state that there are eight words or three words in the sentence. There are, in fact, three word types in the sentence: "a", "is" and "rose". There are eight word tokens in a token copy of the line. The line itself is a type, so it does not contain eight word types; it contains (as stated) only the three word types "a", "is" and "rose", each of which is unique. So what do we call what there are eight of? They are occurrences of words: three occurrences of the word type "a", two of "is" and three of "rose".

Source: https://en.wikipedia.org/wiki/Type%E2%80%93token_distinction#Occurrences
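The counts in the example above can be checked with a few lines of Haskell. This is only an illustrative sketch, not part of this module: the names `tokens`, `occurrencesOf` and `sentence` are hypothetical, and lower-casing is assumed so that "A" and "a" share one word type.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- | Word tokens of a sentence, lower-cased so that "A" and "a" share a type.
tokens :: String -> [String]
tokens = words . map toLower

-- | Occurrences per word type: each type is mapped to how often it occurs.
occurrencesOf :: [String] -> Map.Map String Int
occurrencesOf ws = Map.fromListWith (+) [(w, 1) | w <- ws]

sentence :: String
sentence = "A rose is a rose is a rose"

-- length (tokens sentence)                     == 8  (word tokens)
-- Map.size (occurrencesOf (tokens sentence))   == 3  (word types)
-- Map.toList (occurrencesOf (tokens sentence)) == [("a",3),("is",2),("rose",3)]
```

The map has the same shape as the `Occ` synonym documented below: one key per word type, with its number of occurrences as the value.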

# Documentation

type Occ a = Map a Int

type Cooc a = Map (a, a) Int

type FIS a = Map (Set a) Int
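To make the three synonyms concrete, here is a small sketch with hand-written sample values. The synonyms are copied locally so the snippet compiles on its own; the `sample*` names and their contents are hypothetical, and reading `FIS` as "frequent item set" is an assumption, since the page does not expand the abbreviation.

```haskell
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Local copies of the synonyms listed above, so the sketch is self-contained.
type Occ a  = Map a Int
type Cooc a = Map (a, a) Int
type FIS a  = Map (Set a) Int

-- Hypothetical data: how often each term occurs.
sampleOcc :: Occ String
sampleOcc = Map.fromList [("a", 3), ("is", 2), ("rose", 3)]

-- Hypothetical data: how often two terms appear together (keys are ordered pairs).
sampleCooc :: Cooc String
sampleCooc = Map.fromList [(("is", "a"), 2), (("rose", "a"), 2)]

-- Hypothetical data: a count attached to a whole set of terms.
sampleFIS :: FIS String
sampleFIS = Map.fromList [(Set.fromList ["a", "rose"], 2)]
```

Note that a `Cooc` key is an ordered pair, so `("is", "a")` and `("a", "is")` are distinct keys, whereas a `FIS` key is a set and ignores order.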

data Group

Constructors

 ByStem
 ByOntology

type Occs = Int

type Coocs = Int

cooc :: [[Terms]] -> Map ([Text], [Text]) Int

coocOnWithLabel :: (Ord label, Ord b) => (a -> b) -> (b -> label) -> [[a]] -> Map (label, label) Coocs

coocOn :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int

coocOn' :: Ord b => (a -> b) -> [a] -> Map (b, b) Int
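The page gives only the types of `coocOn` and `coocOn'`, so the following is a sketch of one way functions with these signatures could count co-occurrences, not the module's actual code. The names `coocOneSketch` and `coocSketch` are hypothetical. The sketch assumes that duplicate keys within a context are ignored, that each pair is stored once with the larger key first, and that the diagonal pairs `(x, x)` are counted too.

```haskell
import Data.List (nub)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | One context: project each item with `key`, drop duplicate keys, then
-- count every pair (x, y) with x >= y exactly once (assumption of this sketch).
coocOneSketch :: Ord b => (a -> b) -> [a] -> Map (b, b) Int
coocOneSketch key ctx =
  Map.fromListWith (+) [ ((x, y), 1) | x <- keys, y <- keys, x >= y ]
  where
    keys = nub (map key ctx)

-- | Many contexts: sum the per-context maps key by key.
coocSketch :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int
coocSketch key = Map.unionsWith (+) . map (coocOneSketch key)

-- Usage: two contexts sharing the terms "a" and "b".
-- coocSketch id [["a", "b"], ["a", "b", "c"]]
--   maps ("b","a") to 2 and ("c","a") to 1; ("a","b") is absent.
```

Storing only pairs with `x >= y` keeps the map half the size of a full symmetric matrix; callers that need the other orientation can swap the pair before looking it up.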

coocOnContexts :: (a -> [Text]) -> [[a]] -> Map ([Text], [Text]) Int

coocOnSingleContext :: (a -> [Text]) -> [a] -> [(([Text], [Text]), Int)]

Compute the grouped occurrences (occ)

occurrencesOn :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)

occurrencesWith :: (Foldable list, Ord k, Num a) => (b -> k) -> list b -> Map k a
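As an illustration of what "grouped occurrences" can mean for these two signatures, here is a hedged sketch. `groupedOccSketch` and `countBySketch` are hypothetical names with the same types as `occurrencesOn` and `occurrencesWith`; the bodies are one plausible reading of those types, not a copy of the module. Lower-casing stands in for a real grouping function such as a stemmer.

```haskell
import Data.Char (toLower)
import Data.Foldable (foldl')
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Group items by `key`; inside each group, count every original item.
groupedOccSketch :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)
groupedOccSketch key = foldl' step Map.empty
  where
    step acc x =
      Map.insertWith (Map.unionWith (+)) (key x) (Map.singleton x 1) acc

-- | Count how many items fall under each key, in any numeric type.
countBySketch :: (Foldable list, Ord k, Num n) => (b -> k) -> list b -> Map k n
countBySketch key = foldl' (\acc x -> Map.insertWith (+) (key x) 1 acc) Map.empty

-- Usage, with lower-casing as a stand-in for stemming:
-- groupedOccSketch (map toLower) ["Rose", "rose", "Is", "rose"]
--   groups "Rose" (1) and "rose" (2) under "rose", and "Is" (1) under "is".
```

The nested map keeps the surface forms visible inside each group, while the flat variant only keeps the total per key.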

sumOcc :: Ord a => [Occ a] -> Occ a
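Going by its type, `sumOcc` merges several occurrence maps into one. A minimal sketch of that idea, assuming counts for the same key are added together, is shown here; `sumOccSketch` is a hypothetical name and the `Occ` synonym is copied locally so the snippet stands alone.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Local copy of the synonym documented above.
type Occ a = Map a Int

-- | Merge occurrence maps, adding the counts of keys that appear in several.
sumOccSketch :: Ord a => [Occ a] -> Occ a
sumOccSketch = Map.unionsWith (+)

-- Usage:
-- sumOccSketch [ Map.fromList [("a", 1), ("rose", 2)]
--              , Map.fromList [("rose", 1), ("is", 1)] ]
--   == Map.fromList [("a", 1), ("is", 1), ("rose", 3)]
```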