In this section we will explore some more features of spaCy, an open-source Natural Language Processing (NLP) framework.
3) Use hash values for any string
spaCy can produce a hash value for any string, and given that hash value it returns the original string, provided the vocabulary has already seen that string. The hash is a fixed value: the same string always maps to the same number.
The basic idea behind hashing is that numbers are cheaper to store and compare than strings, so spaCy works with integer IDs internally and only converts them back to text when needed.
CODE
import spacy
nlp = spacy.load('en_core_web_sm')
# Processing the text adds its words, including 'India', to the vocabulary
doc = nlp('India is my country')
India_hash = nlp.vocab.strings['India']  # string -> 64-bit integer hash
India_text = nlp.vocab.strings[India_hash]  # hash -> 'India'
print(India_hash, India_text)
Note: The numbers look random, but they are not: the same string always produces the same hash value.
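The two-way lookup described above can be sketched with the standard library alone. This is only a toy illustration of the idea, not spaCy's implementation: the class name ToyStringStore, its methods, and the choice of SHA-256 as the hash function are all assumptions made for this example, and spaCy's real hash values will differ.

```python
import hashlib


class ToyStringStore:
    """Toy two-way string/hash table (illustration only, not spaCy's code)."""

    def __init__(self):
        self._hash_to_string = {}

    @staticmethod
    def hash_string(text):
        # Assumption for this sketch: use the first 8 bytes of a SHA-256
        # digest as a 64-bit integer. The result depends only on the text.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, text):
        # Remember the string so the hash can be mapped back to it later.
        key = self.hash_string(text)
        self._hash_to_string[key] = text
        return key

    def __getitem__(self, key):
        # String -> hash can always be computed.
        if isinstance(key, str):
            return self.hash_string(key)
        # Hash -> string only works for strings that were added before.
        return self._hash_to_string[key]


store = ToyStringStore()
india_hash = store.add("India")
india_text = store[india_hash]
print(india_hash, india_text)
```

Running the script twice prints the same number both times, which is the "fixed value" property. Looking up a hash that was never added raises a KeyError, which mirrors why the string has to be seen first.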
4) Recognise and update named entities
A named entity is a real-world object that is referred to by a name, for example a person, a book, a car, an amount of money, and many more.
spaCy can recognize various types of entities in a given string by parsing the whole string and predicting which spans are entities. Because models are statistical and strongly depend on the examples they were trained on, this doesn't always work perfectly and might need some tuning later, depending on your use case.
CODE
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('The man, who is awesome, is going to work at Apple and he will buy the company for $10 million')
# Print each entity with its character offsets and its label
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)
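To make the printed columns easier to read, here is a small standard-library sketch of what the start and end character offsets mean. The sentence and the (mention, label) pairs are hand-written for illustration and are not model output; a real model may find different spans or labels.

```python
text = "She will work at Apple and buy the company for $10 million"

# Hand-written annotations for illustration only (not produced by a model).
annotations = [("Apple", "ORG"), ("$10 million", "MONEY")]

entities = []
for mention, label in annotations:
    start_char = text.index(mention)      # offset of the first character
    end_char = start_char + len(mention)  # offset one past the last character
    entities.append((mention, start_char, end_char, label))

for ent_text, start_char, end_char, label in entities:
    # Slicing the original text with the two offsets gives back the entity.
    assert text[start_char:end_char] == ent_text
    print(ent_text, start_char, end_char, label)
```

The end offset is exclusive, so the two numbers can be used directly as a Python slice on the original text.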
5) Word vectors and similarity
spaCy is able to compare two objects, and make a prediction of how similar they are. Predicting similarity is useful for building recommendation systems or flagging duplicates.
CODE
import spacy
nlp = spacy.load('en_core_web_md')  # make sure to use a larger model with word vectors!
tokens = nlp('dog cat banana')
# Compare every token with every other token
for token1 in tokens:
    for token2 in tokens:
        print(token1.text, token2.text, token1.similarity(token2))
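By default, spaCy's similarity score is based on the cosine similarity of the word vectors. The sketch below shows that calculation in plain Python. The three-dimensional vectors are made up for illustration (real model vectors have hundreds of dimensions), and the function name cosine_similarity is an assumption for this example rather than a spaCy API.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up toy vectors, chosen so that "dog" and "cat" point in a similar
# direction and "banana" points elsewhere. These are not real model vectors.
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

for word1, vec1 in toy_vectors.items():
    for word2, vec2 in toy_vectors.items():
        print(word1, word2, round(cosine_similarity(vec1, vec2), 3))
```

A word compared with itself scores 1.0, and words whose vectors point in similar directions score higher than unrelated ones.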