NOTE: I needed this information, but this page had been eaten by spam for a long time; it was imported from the old wiki after the crash but never cleaned up. The best copy I could find was version 5, which I took into a text editor and cleaned up.
Sometimes you may want to render a page from another source (perhaps an internal server, visible to your webserver but not to the internet at large), either as is or with some editing.
This is quite easy in Rails. On this page I’ll walk you through the development of a fairly complete proxy wrapper (which at the time of this writing is being used to HowToWrapRailsAroundAnExistingApplication). If your needs are simple, you may want to just grab one of the early examples, and stop reading as soon as we’ve gotten to the functionality you need. If your needs are urgent, you may want to just jump to the end and grab the finished product.
But if you want to understand how it all works, feel free to follow along as I “program out loud”[1] (http://wiki.rubyonrails.org/rails/pages/ProgrammingOutLoud).
—————————-
By using a combination of render_text (which actually renders HTML) and Ruby’s Net library’s Net::HTTP.get_response you can handle most pages in just a few lines of code:
app/controllers/default_controller.rb
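require 'net/http'
class DefaultController < ApplicationController
  def default
    # Fetch the same path (and query string) from the internal server and render its body
    render_text Net::HTTP.get_response(
      "realserver.internal.net",
      request.env["REQUEST_URI"]
    ).body
  end
end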
Any HTTP GET request which is routed to the default action on the default controller will return the body of the response from GETing the corresponding path (& query string, if any) from realserver.internal.net.
For this to work, all you need to do is route the URLs you want to pass through to realserver.internal.net into the default method on the default controller. If there are only one or two you can do it by hand; if there are lots it helps to know HowToRouteGenericURLsToAController.
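To see what get_response hands back without a Rails app or the real backend, here is a minimal sketch that fetches from a throwaway server on the local machine. The one-shot TCPServer merely stands in for realserver.internal.net, and the path /hello?x=1 is made up for illustration.

```ruby
require 'net/http'
require 'socket'

# Stand-in backend: a one-shot HTTP server on a free local port (an assumption
# for this sketch; in the article the backend is realserver.internal.net).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

backend = Thread.new do
  client = server.accept
  request_line = client.gets                 # e.g. "GET /hello?x=1 HTTP/1.1"
  loop do                                    # skip the request headers
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  body = "backend saw: #{request_line.split[1]}"
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n" \
               "#{body}"
  client.close
end

# The same call the controller makes: host, path (with query string), port.
response = Net::HTTP.get_response('127.0.0.1', '/hello?x=1', port)
backend.join
server.close

puts response.code
puts response.body
```

The response object carries the status code and headers as well as the body, which the later sections rely on.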
This simple setup is fine for some applications, but other times you’ll need more.
—————————-
One common possibility would be requests other than just GET. For example, forms frequently use HTTP POST requests, but the code above treats everything as a GET.
Luckily[2], Net::HTTP objects respond to methods (get, post, put, and head) that are just the lower-case versions of the HTTP methods, so we can write:
app/controllers/default_controller.rb
class DefaultController < ApplicationController
  def default
    url = request.env["REQUEST_URI"]
    method = request.env["REQUEST_METHOD"]
    data = request.env["RAW_POST_DATA"]
    render_text Net::HTTP.start('realserver.internal.net') { |x|
      x.send(method.downcase,url,data).body
    }
  end
end
We also have to capture the data (for post and put), to pass along for the methods that need it.
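One detail worth spelling out: Net::HTTP's get and head take (path, headers), while post and put take (path, data, headers), so the argument list depends on the method. The sketch below builds it per method and also refuses method names it does not know, since send will happily call anything. The helper name request_args and the two constants are hypothetical, not part of Net::HTTP.

```ruby
require 'net/http'

# Hypothetical helper for this sketch; only the four methods the article
# mentions are accepted.
SUPPORTED_METHODS = %w[get head post put].freeze
BODY_METHODS = %w[post put].freeze

# Returns the Net::HTTP method name and the argument list to send to it.
def request_args(http_method, path, data, headers = {})
  name = http_method.to_s.downcase
  unless SUPPORTED_METHODS.include?(name)
    raise ArgumentError, "unsupported method: #{http_method}"
  end
  args = BODY_METHODS.include?(name) ? [path, data.to_s, headers] : [path, headers]
  [name, args]
end

name, args = request_args('POST', '/deposit.cgi', 'acct=1&amt=2.00')
p name
p args
# With an open connection this would be used as:
#   Net::HTTP.start(host, port) { |http| http.send(name, *args) }
```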
—————————-
Not all web servers live on port 80—in fact, there are good reasons to put them elsewhere (see HowToMakeVirtualHostsPrivateUnderApache). This too is easy to cope with:
app/controllers/default_controller.rb
class DefaultController < ApplicationController
  def default
    url = request.env["REQUEST_URI"]
    method = request.env["REQUEST_METHOD"]
    data = request.env["RAW_POST_DATA"]
    port = 80
    render_text Net::HTTP.start('realserver.internal.net',port) { |x|
      x.send(method.downcase,url,data).body
    }
  end
end
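If you would rather keep host and port together, one option is to hold the backend as a single URL and let Ruby's uri library split it. This is only a sketch: the constant BACKEND, the helper backend_uri, and port 81 are assumptions for illustration.

```ruby
require 'uri'

# Assumed setting: the backend's base URL, kept in one place.
BACKEND = URI.parse('http://realserver.internal.net:81')

# Hypothetical helper: map the incoming request path onto the backend.
def backend_uri(request_uri)
  URI.parse("http://#{BACKEND.host}:#{BACKEND.port}#{request_uri}")
end

uri = backend_uri('/account_lookup.cgi?acct=42')
puts uri.host          # the host to pass to Net::HTTP.start
puts uri.port          # the port to pass to Net::HTTP.start
puts uri.request_uri   # path plus query string, ready to send
```

When the URL has no explicit port, URI falls back to 80 for http, so the default case needs no special handling.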
—————————-
Another common situation (especially once you start posting form data) is redirection. The response you get back from realserver.internal.net may not be the page you want to display; it may be an HTTP Redirect Response, telling you where to go (in a nice way).
There are several ways you could handle redirections; my favorite is to do it TheRubyWay: simply extend the HTTPResponse classes by adding an after_redirection method. For most HTTPResponses this will just be the original response, for which we could write something like:
alpha version
class Net::HTTPResponse
  def after_redirection
    self
  end
end
But for redirection responses it will be something like:
alpha version
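class Net::HTTPRedirection
  def after_redirection
    # get_response returns a response object (get would return only the body string)
    Net::HTTP.get_response(URI.parse(self['location'])).after_redirection
  end
end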
Then instead of giving render_text the response.body we would give it the response.after_redirection.body and all should be well.
That is, until we get a redirection from an ill-mannered server that, instead of sending an absolute URL, sends us a relative path.
sigh
This means that “after_redirection” needs to know the URI of the original request, and (after extending URI::HTTP to compute relative URIs) we wind up with this:
app/controllers/default_controller.rb
require 'net/http'
require 'uri'
class URI::HTTP
  # Resolve new_loc (absolute path, absolute URL, or relative path) against this URI
  def +(new_loc)
    URI.parse(
      case new_loc
      when /^[\/]/ then "http://#{host}:#{port}#{new_loc}"
      when /^.{3,6}:/ then new_loc
      else "http://#{host}:#{port}#{path.sub(/\/?[^\/]*$/,'/'+new_loc)}"
      end
    )
  end
end
class Net::HTTPResponse
  def after_redirection_from(uri)
    self
  end
end
class Net::HTTPRedirection
  def after_redirection_from(uri)
    # Pass the new URI along so a further relative redirect resolves against it
    new_uri = uri + self['location']
    Net::HTTP.get_response(new_uri).after_redirection_from(new_uri)
  end
end
class NukeController < ApplicationController
  def default
    path = request.env["REQUEST_URI"]
    method = request.env["REQUEST_METHOD"]
    data = request.env["RAW_POST_DATA"]
    host = "realserver.internal.net"
    port = 80
    uri = URI.parse("http://#{host}:#{port}#{path}")
    response = Net::HTTP.start(host,port) { |x|
      x.send(method.downcase,path,data)
    }
    render_text response.after_redirection_from(uri).body
  end
end
The class extensions can be done anywhere (isn’t Ruby great!) but I’d typically move them to a helper called “extensions” (and if I really liked them, try to get them added to the base libraries).
I like the ‘+’ operator on URIs, but I don’t really like the “after_redirection_from” method, mainly because it requires passing in the URI being redirected from. If the URI were available in the response, or if people actually conformed to the standards and only sent absolute URIs, I would be much happier with it.
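For comparison, Ruby's standard uri library can already resolve a Location value against the URI that was requested, via URI.join. A minimal sketch covering the three cases the hand-written operator handles; the URLs are made up for illustration.

```ruby
require 'uri'

# The URI that was originally requested (illustrative).
base = 'http://realserver.internal.net:81/shop/cart.cgi?item=3'

# The three shapes a Location header might take.
absolute_path = URI.join(base, '/login.cgi')
relative_path = URI.join(base, 'checkout.cgi?step=2')
absolute_url  = URI.join(base, 'http://other.internal.net/done')

puts absolute_path   # http://realserver.internal.net:81/login.cgi
puts relative_path   # http://realserver.internal.net:81/shop/checkout.cgi?step=2
puts absolute_url    # http://other.internal.net/done
```

URI.join returns URI objects, so host, port, and path are available for the follow-up request.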
—————————-
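class A_cookie_jar
  # Record the cookies in the Set-Cookie header value s received from uri
  def parse_cookie_from(uri,s)
  end
  # Return the cookies to send with a request to uri
  def cookies_for(uri)
  end
end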
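The body of the class is not shown here, so the following is only a guess at a minimal implementation with the same two methods. It is a sketch under loud assumptions: cookies are keyed by host only, and path, expiry, domain, and Secure attributes are ignored. It sticks to plain hashes so the jar can be marshalled into the session.

```ruby
require 'uri'

# Hypothetical minimal cookie jar; NOT the original implementation.
class A_cookie_jar
  def initialize
    @cookies = {}   # host => { cookie name => value }
  end

  # s is a Set-Cookie header value such as "sid=abc123; path=/", or nil.
  def parse_cookie_from(uri, s)
    return if s.nil? || s.empty?
    name, value = s.split(';').first.to_s.strip.split('=', 2)
    return if name.nil? || name.empty?
    (@cookies[uri.host] ||= {})[name] = value.to_s
  end

  # Returns a Cookie header value such as "sid=abc123; theme=dark".
  def cookies_for(uri)
    @cookies.fetch(uri.host, {}).map { |name, value| "#{name}=#{value}" }.join('; ')
  end
end

jar = A_cookie_jar.new
login = URI.parse('http://realserver.internal.net:81/login.cgi')
jar.parse_cookie_from(login, 'sid=abc123; path=/; HttpOnly')
puts jar.cookies_for(login)
```

Note that Net::HTTP joins several Set-Cookie headers into one string when read with response['set-cookie']; response.get_fields('set-cookie') returns them one by one, which a fuller jar would want to use.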
Using it was simple too, though it required refactoring the solution for handling HTTP redirects. If you recall, I wasn’t happy with having to pass the URI into the redirect; passing around a jar full of cookies was too much.
So I pulled it into the dispatcher as a private method called fetch, which takes care of everything. It uses a method called cookie_jar to access A_cookie_jar stored in the session (or create one if needed).
Here’s the code:
def cookie_jar
  @session[:cookie_jar] ||= A_cookie_jar.new
end
def fetch(method,uri,data)
  headers = {'Cookie' => cookie_jar.cookies_for(uri).to_s }
  # Only POST and PUT carry a request body
  args = (method == 'POST' || method == 'PUT') ? [ data, headers ] : [ headers ]
  # compact drops a nil query string so we never request "path?"
  response = Net::HTTP.start(uri.host,uri.port) { |x|
    x.send(method.downcase,[uri.path,uri.query].compact.join('?'),*args)
  }
  cookie_jar.parse_cookie_from(uri,response['set-cookie'])
  # Follow redirects with a GET, resolving Location against the current URI
  response = fetch('GET',uri + response['location'],data) if response.is_a? Net::HTTPRedirection
  response
end
The default handler now looks like this:
def default
  response = fetch(
    request.env["REQUEST_METHOD"],
    URI.parse("http://realserver.internal.net:81#{request.env["REQUEST_URI"]}"),
    request.env[ "RAW_POST_DATA" ]
  )
  render_text response.body
end
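One thing a recursive redirect follower does not guard against is a backend that redirects forever. A hedged sketch of a hop limit follows; the helper follow_redirects and the limit of 5 are assumptions, and the block is whatever performs the actual request, so the loop can be exercised without a network.

```ruby
require 'net/http'
require 'uri'

MAX_REDIRECTS = 5   # assumed limit; pick whatever suits your backend

# Hypothetical helper: yields each URI to the block, which must return a
# Net::HTTPResponse, and keeps going while the response is a redirection.
def follow_redirects(uri, limit = MAX_REDIRECTS)
  response = yield(uri)
  while response.is_a?(Net::HTTPRedirection)
    raise "too many redirects (stopped at #{uri})" if limit.zero?
    limit -= 1
    uri = URI.join(uri.to_s, response['location'])   # handles relative Locations
    response = yield(uri)
  end
  response
end

# Usage against a real backend (not executed here):
#   response = follow_redirects(start_uri) { |u| Net::HTTP.get_response(u) }
```

Raising on the limit is one choice among several; rendering a 502-style error page to the user would be another.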
—————————-
Sometimes the backend machine will return an error code (something other than “200 OK”). You can of course handle these yourself, as we did with redirection, but any that you don’t handle should probably be passed through to the user. This is trivial in Rails, requiring only a small change to the render_text statement:
app/controllers/default_controller.rb
def default
  response = fetch(
    request.env["REQUEST_METHOD"],
    URI.parse("http://realserver.internal.net:81#{request.env["REQUEST_URI"]}"),
    request.env[ "RAW_POST_DATA" ]
  )
  # Pass the backend's status line through along with the body
  render_text response.body,"#{response.code} #{response.message}"
end
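The status string is assembled from two Net::HTTPResponse accessors: code, which is a String rather than an Integer, and message, the reason phrase. A small sketch using a hand-built response object, so no backend is needed; the helper name status_line is made up.

```ruby
require 'net/http'

# Hypothetical helper: the status line to hand to render_text.
def status_line(response)
  "#{response.code} #{response.message}"
end

# Built by hand for illustration; normally this comes back from the backend.
not_found = Net::HTTPNotFound.new('1.1', '404', 'Not Found')
puts status_line(not_found)
```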
—————————-
We are now well positioned to filter both the incoming requests (before we fetch the results) and the responses to them (after we fetch them but before we render them). I’ll take up filtering the results in the next section, and here focus on some simple sanity checking of the requests. You could of course do something more elaborate if needed.
The basic idea is this: we know what the user requested (url, QueryString and PostData). If it seems like a reasonable request, we can give it to them; if not, we brush them off with an error:
app/controllers/default_controller.rb
def default
  uri = URI.parse("http://realserver.internal.net:81#{request.env["REQUEST_URI"]}")
  data = request.env[ "RAW_POST_DATA" ]
  if reasonable_request(uri,data)
    response = fetch(request.env["REQUEST_METHOD"],uri,data)
    render_text response.body,"#{response.code} #{response.message}"
  else
    render "bad_request","400 Bad Request"
  end
end
def reasonable_request(uri,data)
  # uri is a URI object, so match against its string form;
  # . and ? are escaped so they match literally
  case uri.to_s
  when /account_lookup\.cgi\?acct=(\d+)/ then true
  when /deposit\.cgi\?acct=(\d+)&amt=(\d+[.]\d\d)/ then true
  else false
  end
end
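A somewhat stricter variant, sketched under assumptions: patterns that are not anchored accept any URL that merely contains a matching substring, so this version looks up the exact path and requires the whole query string to match. The table ALLOWED_REQUESTS and the method name reasonable_request? are hypothetical, and POST data is not checked here.

```ruby
require 'uri'

# Hypothetical whitelist: request path => pattern the ENTIRE query must match.
ALLOWED_REQUESTS = {
  '/account_lookup.cgi' => /\Aacct=\d+\z/,
  '/deposit.cgi'        => /\Aacct=\d+&amt=\d+\.\d\d\z/
}.freeze

def reasonable_request?(uri)
  pattern = ALLOWED_REQUESTS[uri.path]
  !pattern.nil? && pattern.match?(uri.query.to_s)
end

good = URI.parse('http://realserver.internal.net:81/deposit.cgi?acct=42&amt=10.00')
bad  = URI.parse('http://realserver.internal.net:81/admin.cgi?acct=42')
puts reasonable_request?(good)
puts reasonable_request?(bad)
```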
—————————-
My main motivation for filtering the proxied pages was to replace the UglyURLs (which revealed far too much about the inner workings of the site) with PrettyURLs. The first stab at this was simply to remove direct references to the CGI scripts in the URLs, and pull the field names and LineNoise out of the query string. Again, you could do something more elaborate if you needed to.
app/controllers/default_controller.rb
class NukeController < ApplicationController
  after_filter :make_pretty_urls
  # ... (rest of the controller omitted)
  def make_pretty_urls
    @response.body.gsub!(/modules\.php\?name=([a-z_]+)((&[a-z]+=[a-z0-9]*)*)/i) { |m|
      module_name = $1
      parameters = ($2||'')
      case module_name
      when /your_account/i then module_name = 'Account'
      end
      module_name+parameters.gsub(/&[a-z]+=/i,'/')
    }
  end
end
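To see what the substitution does, here is the same pattern applied to a plain string outside Rails. The helper prettify and the sample HTML are made up for illustration.

```ruby
# Same pattern as in the filter above.
UGLY_URL = /modules\.php\?name=([a-z_]+)((&[a-z]+=[a-z0-9]*)*)/i

# Hypothetical standalone version of make_pretty_urls.
def prettify(html)
  html.gsub(UGLY_URL) do
    module_name = $1          # capture before any further regexp matching
    parameters  = $2 || ''
    module_name = 'Account' if module_name =~ /your_account/i
    # Drop the field names, keep the values as path segments
    module_name + parameters.gsub(/&[a-z]+=/i, '/')
  end
end

html = '<a href="modules.php?name=Your_Account&op=edit&id=42">Edit</a> ' \
       '<a href="modules.php?name=News">News</a>'
puts prettify(html)
# <a href="Account/edit/42">Edit</a> <a href="News">News</a>
```

The pattern expects a raw & between parameters; pages that write it as an HTML entity would need the pattern adjusted, and the pretty URLs still have to be mapped back to the real ones on the way in.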
—————————-
You may want to cache images locally…
[1] And since this is a Wiki, you can even interrupt to ask questions!
[2] The whole goal of good engineering is to increase the chance that you will be “lucky” at any given point in the future.
category: Howto