Wednesday, July 10, 2019

PHP Files Processing Every Word in a File

PHP Files

Processing Every Word in a File

Problem

You want to do something with every word in a file. For example, you want to build a concordance of how many times each word is used to compute similarities between documents.

Solution

Read in each line with fgets(), separate the line into words, and process each word:

$fh = fopen('great-american-novel.txt', 'r') or die("Can't open file");

// fgets() returns false at end of file, which ends the loop
while (($s = fgets($fh)) !== false) {
    $words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
    // process words
}

fclose($fh) or die("Can't close file");
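As a rough illustration of the concordance mentioned in the Problem, the sketch below counts how often each word appears. The helper name add_words() and the sample lines are hypothetical, and lowercasing each word is an assumption about how words should be matched; in practice each line would come from fgets() rather than from an array.

```php
<?php
// Hypothetical helper: add the words of one line to a running tally.
function add_words(string $line, array &$counts): void
{
    // Split on runs of whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // Lowercase so "The" and "the" count as one word (an assumption).
        $key = strtolower($word);
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
}

// Sample lines standing in for what fgets() would return.
$lines = ["The cat saw\n", "the other cat\n"];

$counts = [];
foreach ($lines as $line) {
    add_words($line, $counts);
}

arsort($counts);   // most frequent words first
print_r($counts);
```

Two documents could then be compared by looking at which keys their tallies share, though how to score similarity is left open here.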

Discussion

This example calculates the average word length in a file:

$word_count = $word_length = 0;

if ($fh = fopen('great-american-novel.txt', 'r')) {
    while (($s = fgets($fh)) !== false) {
        $words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            $word_count++;
            $word_length += strlen($word);
        }
    }
    fclose($fh);
}

// Guard against dividing by zero when the file is empty or unreadable
if ($word_count > 0) {
    printf("The average word length over %d words is %.02f characters.",
           $word_count,
           $word_length / $word_count);
}

Processing every word proceeds differently depending on how “word” is defined. The code in this recipe uses the Perl-compatible regular expression engine’s \s whitespace metacharacter, which includes space, tab, newline, carriage return, and formfeed.
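A small check of that behavior, using a made-up sample string that mixes the whitespace characters listed above:

```php
<?php
// Sample string (hypothetical) with a tab, newline, CR+LF pair, and form feed
// between single-letter "words".
$s = "a\tb\nc\r\nd\fe";

// \s+ treats any run of whitespace characters as one delimiter.
$words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Because the pattern is \s+ rather than \s, the carriage return and newline together count as a single separator instead of producing an empty word between them.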

A simpler approach is to break a line into words by splitting on a single space, which is useful when the words have to be rejoined with spaces afterward. The Perl-compatible engine also has a word-boundary assertion (\b) that matches between a word character (a letter, digit, or underscore) and a nonword character (anything else). Using \b instead of \s to delimit words most noticeably changes how words with embedded punctuation are treated. The term 6 o’clock is two words when split by whitespace (6 and o’clock); it’s four words when split by word boundaries (6, o, ', and clock).
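The difference can be seen with a short sketch. It uses a straight apostrophe in the sample string, and it filters out the whitespace-only pieces that a split on \b leaves behind; that filtering step is an addition here, not something the recipe prescribes.

```php
<?php
$s = "6 o'clock";

// Split on runs of whitespace.
$by_space = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

// Split at every word boundary, then drop pieces that are only whitespace.
$pieces = preg_split('/\b/', $s, -1, PREG_SPLIT_NO_EMPTY);
$by_boundary = array_values(array_filter($pieces, function ($p) {
    return trim($p) !== '';
}));

print_r($by_space);     // 6, o'clock
print_r($by_boundary);  // 6, o, ', clock
```

Which result is preferable depends on what the word list is for: whitespace splitting keeps contractions intact, while boundary splitting separates punctuation from the letters around it.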
