NYTProf Performance Profile
For ../prof.pl
  Run on Wed Dec 14 15:57:08 2022
Reported on Wed Dec 14 16:00:30 2022

Filename: /Users/ether/.perlbrew/libs/36.0@std/lib/perl5/URI/Escape.pm
Statements: Executed 274 statements in 865µs
Subroutines
Calls  P  F  Exclusive Time  Inclusive Time  Subroutine
1      1  1  17µs            20µs            URI::Escape::BEGIN@3
1      1  1  14µs            27µs            URI::Escape::BEGIN@147
1      1  1  5µs             19µs            URI::Escape::BEGIN@191
1      1  1  4µs             26µs            URI::Escape::BEGIN@4
1      1  1  2µs             2µs             URI::Escape::BEGIN@153
2      1  1  2µs             2µs             URI::Escape::CORE:qr (opcode)
0      0  0  0s              0s              URI::Escape::_fail_hi
0      0  0  0s              0s              URI::Escape::escape_char
0      0  0  0s              0s              URI::Escape::uri_escape
0      0  0  0s              0s              URI::Escape::uri_escape_utf8
0      0  0  0s              0s              URI::Escape::uri_unescape
Line | Statements | Time on line | Calls | Time in subs | Code
1package URI::Escape;
2
3  2  25µs  2  23µs
# spent 20µs (17+3) within URI::Escape::BEGIN@3 which was called: # once (17µs+3µs) by OpenAPI::Modern::BEGIN@25 at line 3
use strict;
# spent 20µs making 1 call to URI::Escape::BEGIN@3 # spent 3µs making 1 call to strict::import
4  2  57µs  2  48µs
# spent 26µs (4+22) within URI::Escape::BEGIN@4 which was called: # once (4µs+22µs) by OpenAPI::Modern::BEGIN@25 at line 4
use warnings;
# spent 26µs making 1 call to URI::Escape::BEGIN@4 # spent 22µs making 1 call to warnings::import
5
6=head1 NAME
7
8URI::Escape - Percent-encode and percent-decode unsafe characters
9
10=head1 SYNOPSIS
11
12 use URI::Escape;
13 $safe = uri_escape("10% is enough\n");
14 $verysafe = uri_escape("foo", "\0-\377");
15 $str = uri_unescape($safe);
16
17=head1 DESCRIPTION
18
19This module provides functions to percent-encode and percent-decode URI strings as
20defined by RFC 3986. Percent-encoding URIs is informally called "URI escaping".
21This is the terminology used by this module, which predates the formalization of the
22terms by the RFC by several years.
23
24A URI consists of a restricted set of characters. The restricted set
25of characters consists of digits, letters, and a few graphic symbols
26chosen from those common to most of the character encodings and input
27facilities available to Internet users. They are made up of the
28"unreserved" and "reserved" character sets as defined in RFC 3986.
29
30 unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
31 reserved = ":" / "/" / "?" / "#" / "[" / "]" / "@"
32 "!" / "$" / "&" / "'" / "(" / ")"
33 / "*" / "+" / "," / ";" / "="
34
35In addition, any byte (octet) can be represented in a URI by an escape
36sequence: a triplet consisting of the character "%" followed by two
37hexadecimal digits. A byte can also be represented directly by a
38character, using the US-ASCII character for that octet.
39
40Some of the characters are I<reserved> for use as delimiters or as
41part of certain URI components. These must be escaped if they are to
42be treated as ordinary data. Read RFC 3986 for further details.
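
The point about reserved characters can be illustrated with a small sketch. It deliberately does not use this module: pct_encode() is a hypothetical helper written only for this example, and it assumes its argument is a string of bytes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::Escape: percent-encode every byte
# outside the RFC 3986 "unreserved" set. Assumes $s holds bytes (0 .. 255).
sub pct_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

# "&" and "=" are reserved delimiters in a query string, so they must be
# escaped when they are part of the value itself.
my $value = "a&b=c";
my $query = "q=" . pct_encode($value);

print "$query\n";    # q=a%26b%3Dc
```

Left unescaped, the value would read as two separate query parameters rather than as one piece of data.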
43
44The functions provided (and exported by default) from this module are:
45
46=over 4
47
48=item uri_escape( $string )
49
50=item uri_escape( $string, $unsafe )
51
52Replaces each unsafe character in the $string with the corresponding
53escape sequence and returns the result. The $string argument should
54be a string of bytes. The uri_escape() function will croak if given a
55character with a code above 255. Use uri_escape_utf8() if you know you
56have such chars and/or want chars in the 128 .. 255 range treated as
57UTF-8.
58
59The uri_escape() function takes an optional second argument that
60overrides the set of characters that are to be escaped. The set is
61specified as a string that can be used in a regular expression
62character class (between [ ]). E.g.:
63
64 "\x00-\x1f\x7f-\xff" # all control and hi-bit characters
65 "a-z" # all lower case characters
66 "^A-Za-z" # everything not a letter
67
68The default set of characters to be escaped is everything that is
69I<not> part of the C<unreserved> character class shown above, which
70means the reserved characters are escaped too. I.e. the default is:
71
72 "^A-Za-z0-9\-\._~"
73
74The second argument can also be specified as a regular expression object:
75
76 qr/[^A-Za-z]/
77
78Any strings matched by this regular expression will have all of their
79characters escaped.
80
81=item uri_escape_utf8( $string )
82
83=item uri_escape_utf8( $string, $unsafe )
84
85Works like uri_escape(), but will encode chars as UTF-8 before
86escaping them. This makes this function able to deal with characters
87with code above 255 in $string. Note that chars in the 128 .. 255
88range will be escaped differently by this function compared to what
89uri_escape() would. For chars in the 0 .. 127 range there is no
90difference.
91
92Equivalent to:
93
94 utf8::encode($string);
95 my $uri = uri_escape($string);
96
97Note: JavaScript has a function called escape() that produces the
98sequence "%uXXXX" for chars in the 256 .. 65535 range. This function
99has really nothing to do with URI escaping but some folks got confused
100since it "does the right thing" in the 0 .. 255 range. Because of
101this you sometimes see "URIs" with this kind of escape. The
102JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
103
104=item uri_unescape($string,...)
105
106Returns a string with each %XX sequence replaced with the actual byte
107(octet).
108
109This does the same as:
110
111 $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
112
113but does not modify the string in-place as this RE would. Using the
114uri_unescape() function instead of the RE might make the code look
115cleaner and is a few characters less to type.
116
117In a simple benchmark test I did, calling the function (instead of
118using the inline RE above) was something like 40% slower when a few
119chars were unescaped, and something like 700% slower when none were.
120If you are going to unescape many times, it might be a good idea to
121inline the RE.
122
123If the uri_unescape() function is passed multiple strings, then each
124one is returned unescaped.
125
126=back
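
A minimal usage sketch of the three functions documented above, assuming the URI::Escape module is installed. The input strings and variable names are chosen only for illustration.

```perl
use strict;
use warnings;
use URI::Escape;    # exports uri_escape, uri_unescape, uri_escape_utf8

# Default set: everything outside the "unreserved" class is escaped.
my $safe = uri_escape("10% is enough\n");     # 10%25%20is%20enough%0A

# Custom unsafe set, given as the contents of a character class.
my $lower = uri_escape("abc-DEF", "a-z");     # %61%62%63-DEF

# Characters with a code above 255 need the UTF-8 variant.
my $snow = uri_escape_utf8("\x{2603}");       # %E2%98%83

# Round trip back to the original bytes.
my $orig = uri_unescape($safe);               # "10% is enough\n"

# In list context, several strings are unescaped in one call.
my @parts = uri_unescape("a%20b", "c%2Fd");   # ("a b", "c/d")

print "$safe\n$lower\n$snow\n";
```

Passing "\x{2603}" to uri_escape() with the default set would croak, which is why the UTF-8 variant is used for that string.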
127
128The module can also export the C<%escapes> hash, which contains the
129mapping from all 256 bytes to the corresponding escape codes. Lookup
130in this hash is faster than evaluating C<sprintf("%%%02X", ord($byte))>
131each time.
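
The lookup approach can be sketched without loading the module. The %table hash below is a local stand-in for C<%escapes> (the name is illustrative), built with the same loop that appears in the listing. With the module loaded, C<use URI::Escape qw(%escapes);> would import the real hash instead.

```perl
use strict;
use warnings;

# Local stand-in for the module's %escapes table (the name %table is
# illustrative); it is built the same way as in the listing.
my %table;
$table{chr($_)} = sprintf("%%%02X", $_) for 0 .. 255;

# Escape by hash lookup instead of calling sprintf for every byte.
# Note that this escapes every byte, including the unreserved ones.
my $bytes   = "a b/c";
my $escaped = join '', map { $table{$_} } split //, $bytes;

print "$escaped\n";    # %61%20%62%2F%63
```

The table is built once at load time, so each later escape is a single hash fetch.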
132
133=head1 SEE ALSO
134
135L<URI>
136
137
138=head1 COPYRIGHT
139
140Copyright 1995-2004 Gisle Aas.
141
142This program is free software; you can redistribute it and/or modify
143it under the same terms as Perl itself.
144
145=cut
146
147  3  47µs  3  40µs
# spent 27µs (14+13) within URI::Escape::BEGIN@147 which was called: # once (14µs+13µs) by OpenAPI::Modern::BEGIN@25 at line 147
use Exporter 5.57 'import';
# spent 27µs making 1 call to URI::Escape::BEGIN@147 # spent 7µs making 1 call to Exporter::import # spent 6µs making 1 call to UNIVERSAL::VERSION
148our %escapes;
149  1  1µs  our @EXPORT = qw(uri_escape uri_unescape uri_escape_utf8);
150  1  1µs  our @EXPORT_OK = qw(%escapes);
151  1  0s  our $VERSION = '5.17';
152
153  2  228µs  1  2µs
# spent 2µs within URI::Escape::BEGIN@153 which was called: # once (2µs+0s) by OpenAPI::Modern::BEGIN@25 at line 153
use Carp ();
# spent 2µs making 1 call to URI::Escape::BEGIN@153
154
155# Build a char->hex map
156  1  1µs  for (0..255) {
157  256  228µs  $escapes{chr($_)} = sprintf("%%%02X", $_);
158}
159
160  1  1µs  my %subst; # compiled patterns
161
162  1  10µs  2  2µs  my %Unsafe = (
# spent 2µs making 2 calls to URI::Escape::CORE:qr, avg 1µs/call
163 RFC2732 => qr/[^A-Za-z0-9\-_.!~*'()]/,
164 RFC3986 => qr/[^A-Za-z0-9\-\._~]/,
165);
166
167sub uri_escape {
168 my($text, $patn) = @_;
169 return undef unless defined $text;
170 my $re;
171 if (defined $patn){
172 if (ref $patn eq 'Regexp') {
173 $text =~ s{($patn)}{
174 join('', map +($escapes{$_} || _fail_hi($_)), split //, "$1")
175 }ge;
176 return $text;
177 }
178 $re = $subst{$patn};
179 if (!defined $re) {
180 $re = $patn;
181 # we need to escape the [] characters, except for those used in
182 # posix classes. if they are prefixed by a backslash, allow them
183 # through unmodified.
184 $re =~ s{(\[:\w+:\])|(\\)?([\[\]]|\\\z)}{
185 defined $1 ? $1 : defined $2 ? "$2$3" : "\\$3"
186 }ge;
187 eval {
188 # disable the warnings here, since they will trigger later
189 # when used, and we only want them to appear once per call,
190 # but every time the same pattern is used.
191  2  260µs  2  33µs
# spent 19µs (5+14) within URI::Escape::BEGIN@191 which was called: # once (5µs+14µs) by OpenAPI::Modern::BEGIN@25 at line 191
no warnings 'regexp';
# spent 19µs making 1 call to URI::Escape::BEGIN@191 # spent 14µs making 1 call to warnings::unimport
192 $re = $subst{$patn} = qr{[$re]};
193 1;
194 } or Carp::croak("uri_escape: $@");
195 }
196 }
197 else {
198 $re = $Unsafe{RFC3986};
199 }
200 $text =~ s/($re)/$escapes{$1} || _fail_hi($1)/ge;
201 $text;
202}
203
204sub _fail_hi {
205 my $chr = shift;
206 Carp::croak(sprintf "Can't escape \\x{%04X}, try uri_escape_utf8() instead", ord($chr));
207}
208
209sub uri_escape_utf8 {
210 my $text = shift;
211 return undef unless defined $text;
212 utf8::encode($text);
213 return uri_escape($text, @_);
214}
215
216sub uri_unescape {
217 # Note from RFC1630: "Sequences which start with a percent sign
218 # but are not followed by two hexadecimal characters are reserved
219 # for future extension"
220 my $str = shift;
221 if (@_ && wantarray) {
222 # not executed for the common case of a single argument
223 my @str = ($str, @_); # need to copy
224 for (@str) {
225 s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
226 }
227 return @str;
228 }
229 $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg if defined $str;
230 $str;
231}
232
233# XXX FIXME escape_char is buggy as it assigns meaning to the string's storage format.
234sub escape_char {
235 # Old versions of utf8::is_utf8() didn't properly handle magical vars (e.g. $1).
236 # The following forces a fetch to occur beforehand.
237 my $dummy = substr($_[0], 0, 0);
238
239 if (utf8::is_utf8($_[0])) {
240 my $s = shift;
241 utf8::encode($s);
242 unshift(@_, $s);
243 }
244
245 return join '', @URI::Escape::escapes{split //, $_[0]};
246}
247
248  1  6µs  1;
 
# spent 2µs within URI::Escape::CORE:qr which was called 2 times, avg 1µs/call: # 2 times (2µs+0s) by OpenAPI::Modern::BEGIN@25 at line 162, avg 1µs/call
sub URI::Escape::CORE:qr; # opcode