URL stands for Uniform Resource Locator. A URL is a unique address for each document on the World Wide Web, as well as for every Gopher item and every USENET discussion group.
Uniform Resource Locators were a fundamental innovation in the history of the Internet. They were first used by Tim Berners-Lee in 1991 to let document authors create hyperlinks on the World Wide Web.
Since 1994, Internet standards have folded the concept of the URL into the more general URI (Uniform Resource Identifier), but the term URL is still widely used.
Although no standard has ever defined it this way, many people believe that the initials URL stand for Universal Resource Locator.
This interpretation may stem from the fact that, before the release of RFC 2396, the U in URI was expanded as Universal, whereas the U in URL has always stood for Uniform.
The URL of an information source is the Internet address that allows the browser to find the source and display it accordingly.
To do this, it combines the name of the computer that provides the information, the directory where the information is located, the name of the file, and the protocol to be used to retrieve the data.
The general format: scheme://machine/directory/file
Other data can also be added: scheme://user:password@machine:port/directory/file
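As a rough illustration of the extended format above, the following sketch splits such an address into its parts with Python's standard urllib.parse module. The host name, credentials, port, and path are made-up placeholders, not values taken from any real server.

```python
from urllib.parse import urlsplit

# Hypothetical address of the form scheme://user:password@machine:port/directory/file
url = "ftp://alice:secret@files.example.org:2121/pub/readme.txt"

parts = urlsplit(url)

print(parts.scheme)    # scheme
print(parts.username)  # user
print(parts.password)  # password
print(parts.hostname)  # machine
print(parts.port)      # port, as an integer
print(parts.path)      # /directory/file
```

When the optional data is left out, the corresponding attributes (username, password, port) are simply None.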
A URL is usually classified according to its scheme, which indicates the network protocol used to retrieve the identified resource over the network. The URL begins with the scheme name, followed by a colon and a scheme-specific part.
Some Examples of URL Schemes
HTTP (HyperText Transfer Protocol) is the protocol used to transmit hypertext. All HTML pages on WWW servers should be referenced through this scheme, which indicates a connection to a WWW server.
HTTPS (HyperText Transfer Protocol Secure) is the protocol used to connect to secure WWW servers. These servers are typically run by commercial sites and use encryption to prevent the interception of transmitted data, usually credit card numbers and other personal data.
FTP (File Transfer Protocol) is used when the information to be accessed is on an FTP server. By default, the server is accessed anonymously. To specify a user name, use the form ftp://user@machine, after which the client asks for the password; the password can also be included in the URL, as in ftp://user:password@machine.
Mailto is used to send e-mail, although not all browsers support it. In this case, only the target e-mail address is specified, without the double slash: mailto:alias.mail@machinename
LDAP stands for Lightweight Directory Access Protocol, which is used to query directory services.
Telnet is used for remote terminal emulation, that is, to connect to a multi-user machine, for example to reach public accounts such as library catalogs. The browser normally calls an external application that provides the connection. In this case, the machine and the login are specified.
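The schemes above all follow the same outer pattern: a scheme name, a colon, and a scheme-specific part. The sketch below separates the two for one invented address per scheme; describe is a hypothetical helper written for this illustration, and every address is a placeholder.

```python
from urllib.parse import urlsplit

# Hypothetical example addresses, one for each scheme discussed above
examples = [
    "http://www.example.org/index.html",
    "https://shop.example.org/checkout",
    "ftp://user@ftp.example.org/pub/file.txt",
    "mailto:alias@example.org",
    "telnet://library.example.org",
]

def describe(url: str) -> tuple[str, str]:
    """Return (scheme, scheme-specific part) for a URL."""
    scheme, _, rest = url.partition(":")
    return scheme.lower(), rest

for url in examples:
    scheme, rest = describe(url)
    print(f"{scheme:7} {rest}")

# mailto has no "//" authority section: the address is carried in the path
mail = urlsplit("mailto:alias@example.org")
print(repr(mail.netloc), mail.path)
```

Note that only some schemes place a machine name after a double slash; mailto does not, which is why its address ends up in the path rather than in the authority.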
Some of the popular schemes, such as mailto, http, ftp, and file, together with the general URL syntax, were first described in 1994 in RFC 1630 and then in the more specific RFC 1738 and RFC 1808.
Some of the schemes described in the first RFC are still valid; others have been debated or refined by later standards.
Meanwhile, the definition of the generic URL syntax moved into two successive URI specifications: RFC 2396 (1998) and RFC 2732 (1999). The current standard is STD 66, published as RFC 3986 (2005).
General URL Syntax
All URLs must follow a general syntax, regardless of the scheme. Each scheme can set its own syntax requirements for its scheme-specific part, but the URL as a whole must follow the general syntax.
Using a limited character set compatible with ASCII’s printable subset, the general syntax allows URLs to represent the address of a resource, regardless of the original format of the address components.
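The text above does not name the mechanism, but in practice characters outside the permitted set are usually carried by percent-encoding, as described in RFC 3986. A minimal sketch with Python's urllib.parse, using an invented file name:

```python
from urllib.parse import quote, unquote

# Hypothetical file name with a non-ASCII letter and a space ("café menu.txt")
original = "caf\u00e9 menu.txt"

# quote() encodes the text as UTF-8 and writes each disallowed byte as %XX
encoded = quote(original)
print(encoded)

# unquote() reverses the process and restores the original text
decoded = unquote(encoded)
print(decoded)
```

The encoded form contains only printable ASCII characters, so it can be placed in the path of a URL whatever the original encoding of the file name was.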
Schemes that use typical connection-based protocols share a common generic URI syntax: scheme://authority/path?query#fragment
The authority usually consists of the name or IP address of a server, sometimes followed by a colon (:) and a TCP port number. It can also include a user name and password for authenticating to the server.
The path is a specification of a location in some hierarchical structure that uses a slash (/) as the delimiter between components.
The query usually specifies the parameters of a dynamic query to a database or to a process running on the server.
The fragment (the part after the # character) identifies a portion of a resource, usually a location in a document.
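The components just described can be picked apart programmatically. The following sketch uses Python's urllib.parse on an invented URL; the final component, the one after the # character, is exposed under the name fragment, which is the term RFC 3986 uses for it.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL exercising every component of the generic syntax
url = "http://www.example.org:8080/docs/guide.html?lang=en&page=2#install"

parts = urlsplit(url)
print(parts.scheme)    # scheme
print(parts.netloc)    # authority (host and port)
print(parts.path)      # hierarchical path
print(parts.query)     # raw query string
print(parts.fragment)  # portion of the resource

# The query can be broken down further into named parameters
params = parse_qs(parts.query)
print(params)

# Path components are separated by slashes
segments = [s for s in parts.path.split("/") if s]
print(segments)
```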
According to the current standard, the scheme and host components are case-insensitive and are converted to lowercase when a URL is normalized. The other components should be assumed to be case-sensitive.
In practice, however, whether components other than the scheme and the host are case-sensitive depends on the web server and on the operating system of the machine hosting it.
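A minimal sketch of this case rule follows, assuming that lowercasing the scheme and the host is the only normalization wanted. The function normalize_case is a hypothetical helper, not part of any library, and a full normalizer under RFC 3986 would do more than this.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_case(url: str) -> str:
    """Lowercase the scheme and host; leave every other component untouched."""
    parts = urlsplit(url)  # urlsplit already returns the scheme in lowercase
    # Keep any user:password section as written; only host[:port] is lowercased
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# Hypothetical address: the path keeps its capital letters
print(normalize_case("HTTP://WWW.Example.COM/Docs/Index.html"))
```

The path is deliberately left alone, since /Docs/Index.html and /docs/index.html may or may not name the same resource depending on the server.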
URL in Daily Use
An HTTP URL combines, in a single simple address, the four basic pieces of information needed to retrieve a resource from anywhere on the Internet:
The protocol used to communicate.
The host (server) with which you communicate.
The network port on the server to connect to.
The path to the resource on the server (for example, the file name).
Most web browsers do not require the user to enter “http://” to go to the web page, as HTTP is the most common protocol used in web browsers.
Similarly, since 80 is the default port for HTTP, it is not usually specified.
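The default-port rule can be sketched as follows. The function effective_port and the table DEFAULT_PORTS are hypothetical names introduced for this illustration; port 80 for HTTP comes from the text, while port 443 for HTTPS is added here as the well-known default for that scheme.

```python
from urllib.parse import urlsplit

# 80 for HTTP is stated in the text; 443 for HTTPS is an addition for illustration
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str):
    """Return the port a client would connect to, or None if it is unknown."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port                   # port written explicitly in the URL
    return DEFAULT_PORTS.get(parts.scheme)  # otherwise fall back to the scheme default

# Hypothetical addresses: the first relies on the default, the second does not
print(effective_port("http://www.example.org/index.html"))
print(effective_port("http://www.example.org:8080/index.html"))
```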
Because HTTP allows a server to respond to a request by pointing the web browser to a different URL, many servers also allow users to omit certain parts of the URL, such as the www. prefix.
However, these omissions produce a technically different address, so the web browser cannot make the adjustment itself and has to rely on the server to respond with a redirect. It is even possible for a web server to provide two different pages for URLs that differ only in a trailing slash (/) character.