<!--

@license Apache-2.0

Copyright (c) 2018 The Stdlib Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-->

# tokenize

> Tokenize a string.

<section class="intro">

</section>

<!-- /.intro -->

<section class="usage">

## Usage

```javascript
var tokenize = require( '@stdlib/nlp/tokenize' );
```

#### tokenize( str\[, keepWhitespace] )

Tokenizes a string.

```javascript
var str = 'Hello Mrs. Maple, could you call me back?';
var out = tokenize( str );
// returns [ 'Hello', 'Mrs.', 'Maple', ',', 'could', 'you', 'call', 'me', 'back', '?' ]
```

To include whitespace characters (spaces, tabs, line breaks) in the output array, set `keepWhitespace` to `true`.

```javascript
var str = 'Hello World!\n';
var out = tokenize( str, true );
// returns [ 'Hello', ' ', 'World', '!', '\n' ]
```

</section>

<!-- /.usage -->

<section class="examples">

## Examples

<!-- eslint no-undef: "error" -->

```javascript
var tokenize = require( '@stdlib/nlp/tokenize' );

console.log( tokenize( 'Hello World!' ) );
// => [ 'Hello', 'World', '!' ]

console.log( tokenize( '' ) );
// => []

var str = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod.';
console.log( tokenize( str ) );
/* =>
    [
        'Lorem',
        'ipsum',
        'dolor',
        'sit',
        'amet',
        ',',
        'consetetur',
        'sadipscing',
        'elitr',
        ',',
        'sed',
        'diam',
        'nonumy',
        'eirmod',
        '.'
    ]
*/
```

</section>

<!-- /.examples -->

<section class="links">

</section>

<!-- /.links -->
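
To illustrate what the `keepWhitespace` flag controls, the following is a minimal, dependency-free sketch. `simpleTokenize` and `isNotWhitespace` are hypothetical helpers written for this illustration only; they are not the implementation behind `@stdlib/nlp/tokenize`. The sketch assumes a naive regular-expression split, so it does not recognize abbreviations (it would split `'Mrs.'` into `'Mrs'` and `'.'`) and it groups consecutive whitespace characters into a single token, which may differ from the package's behavior.

```javascript
// Hypothetical helper: tests whether a token contains a non-whitespace character.
function isNotWhitespace( token ) {
    return !/^\s+$/.test( token );
}

// Hypothetical, simplified tokenizer (an assumption for illustration only).
function simpleTokenize( str, keepWhitespace ) {
    // Match runs of whitespace, runs of word characters, or single punctuation characters:
    var tokens = str.match( /\s+|\w+|[^\s\w]/g ) || [];
    if ( keepWhitespace ) {
        return tokens;
    }
    // Drop whitespace-only tokens when whitespace is not requested:
    return tokens.filter( isNotWhitespace );
}

var out = simpleTokenize( 'Hello World!\n', true );
// returns [ 'Hello', ' ', 'World', '!', '\n' ]

out = simpleTokenize( 'Hello World!\n' );
// returns [ 'Hello', 'World', '!' ]
```

In this sketch the flag only decides whether whitespace tokens are filtered from the result; the word and punctuation tokens are the same in both cases.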