Thesis etd-01242020-113350

Thesis type

Master's thesis

Author

LUCCARDA, FRANCESCO

URN

etd-01242020-113350

Title

Casting one's net wide in the web: BootCaT as a tool for comparable and diachronic specialized-corpora collection

Department

Dipartimento di Studi Linguistici e Culturali

Degree programme

LANGUAGES FOR COMMUNICATION IN INTERNATIONAL ENTERPRISES AND ORGANIZATIONS - LINGUE PER LA COMUNICAZIONE NELL'IMPRESA E NELLE ORGANIZZAZIONI INTERNAZIONALI

Committee

Committee member	Role
POPPI FRANCA	Supervisor
DIANI GIULIANA	Co-supervisor

Keywords

BootCat
CDA
corpus linguistics
internet
Web-corpus

Graduation session start date

2020-03-16

Availability

Restricted access: mixed availability (all or some of the thesis files are inaccessible).

Release date

2060-03-16

Abstract

The BootCaT software has been under development since 2004 to help linguists quickly build disposable corpora for translation, terminological databases and machine-learning tasks by automatically collecting web pages based on user-defined keywords. The present work attempts to use the software to create comparable and diachronic web corpora in different languages (English, German and Italian). It reports how the standard BootCaT procedure was adapted and supplemented for this purpose and discusses the quality and usefulness of the resulting corpora. In light of these results, it recommends adjustments for future research in this direction and hypothesizes ideal developments for linguistics in the automation of text collection and analysis.

File

One file is restricted at the author's request.