Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon. Entire thread

PROGRAMMING CHALLENGE

Name: Anonymous 2010-07-16 18:57


This one should be difficult enough for you guys,

Write a function foo that takes a number n and returns a function that takes a number i, and returns n incremented by i.


My submission, in Scheme


(define (foo num)
  (lambda (x) (+ x num)))

Name: Anonymous 2010-07-18 5:39

>>78
read >>1 again. there is no requirement that precludes returning the same function every time.

Name: Anonymous 2010-07-18 5:44

: foo ( n -- quot ) '[ _ + ] ;

Name: Anonymous 2010-07-18 5:50

>>77
You still have to manage the memory containing the closure though
That's easy, just jmp somewhere else and free the memory, then ret.

Name: Anonymous 2010-07-18 5:52

>>81
I would think best practices would have it covered. If you don't think so too you're fired.

Name: 2,4 2010-07-18 19:09

1) would work, except it's more complicated than you make it seem; for instance malloc'd memory is execution-protected on modern operating systems. Honestly to make it portable, your best bet is probably to compile it with llvm. You still have to manage the memory containing the closure though, so it's still not like a closure in a typical HLL.
If malloc isn't enough, you can always use the system's page memory allocator (for example VirtualAlloc on Win32), and if the system respects the no-execute bit or allows read/write/execute memory protection on pages, just make a call to change the memory protection (for example VirtualProtect on Win32). Of course, there are plenty other ways to do this, but on the lower level, it will always involve allocing a piece of memory which can contain executable code. The real problem here is generating the right code which points to the right (closure data) offsets (apply relocations if needed or generate base indepdent code).

2) does not work because your funcall macro can't be passed around as a function pointer. there's no real way to do it other than return a struct (the closing environment) and pass in that struct in the function call.
You'd pass a "function object"(encapsulates a function and the closure object) to the funcall macro. The macro itself calls the function field of the structure with the closure object and the rest of the arguments passed. It can also be done as a function. While this means that you're passing structs around instead of function pointers, and the syntax is a bit different, the functionality is achieved, except it's not nearly as comfortable to use as normal functions.

Closures are probably a pain to use in languages which don't support them natively, maybe to the point you might not want to bother with them.

>>83
How do we know when the closure object will be freed? Most code is usually freed by the OS when the application exits, however that's probably not a good idea for closures as they can grow in number. Given C's approach of manual memory management, you'd probably make some closure_free function which performs the cleanup and call it whenever you don't need the closure anymore.

Name: Anonymous 2010-07-18 20:05

Object pascal:


type
 TNumFunc = function(i: PtrInt): PtrInt of object;
 
 TNumHax = class
  function NumFunk(i: PtrInt): PtrInt;
 end;

function TNumHax.NumFunk(i: PtrInt): PtrInt;
begin
   result := PtrInt(self)+i;
end;

function foo(n: PtrInt): TNumFunc;
begin
   TMethod(result).Code := @TNumHax.NumFunk;
   TMethod(result).Data := Pointer(n);
end;

Name: Anonymous 2010-07-18 21:11

-- I love Haskell
f = (+)

Name: Anonymous 2010-07-18 21:36

I have to say, I'm disappointed at the severe lack of ridiculous implementations.

Name: Anonymous 2010-07-18 21:36

>>85
Yes but your macro can't be used as a function pointer, that's the whole point. You can't pass it around as an int(*)(int). You're not calling the return value of the generator as a function; you're calling some other macro, passing in the return value of the generator.

This is why you can almost accomplish this in C++ by creating an object containing the closure values and overriding operator(). You can't store it in a regular function pointer, but you can still call it as a regular function.

Name: Anonymous 2010-07-18 21:42

>>89
You can't store it in a regular function pointer, but you can still call it as a regular function.
That's similar to >>21 in that you have to call() the returned block, you can't just invoke it like a method.

Name: Anonymous 2010-07-18 21:43

>>88
Don't make me bust out the.. uh.. Tcl.

Name: Anonymous 2010-07-18 22:23

>>90
Actually, it should be possible to generate a small function which calls a specific closure, and you only have to specify the closure params to it. Easy with a bit of asm, but not very portable. That's the problem with languages which can't generate code at runtime and can't assign symbols with code at runtime. Much easier done in Lisp.

Name: Anonymous 2010-07-18 22:23

>>90
Actually, it should be possible to generate a small function which calls a specific closure, and you only have to specify the closure params to it. Easy with a bit of asm, but not very portable. That's the problem with languages which can't generate code at runtime and can't assign symbols with code at runtime. Much easier done in Lisp.

Name: Anonymous 2010-07-18 23:20

>>92
>>21 could be 'fixed', but it wouldn't normally be worth the extra code. Sending a call message is not a serious problem. Fixing the C++ example is certainly not worth it, and any attempts to do so trade on the purity of the concept anyway.

Name: Anonymous 2010-07-18 23:37

I use closures in C all the time.  Also, typedefs are for pussies.


#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int (*mkadder(int n))(int i)
{
    static const unsigned char TEXT[13] = {
        0x55, 0x89, 0xe5, 0x8b, 0x45, 0x08, 0x5d, 0x05, 0, 0, 0, 0, 0xc3
    };
    unsigned char *p = malloc(13);
    if (!p) return NULL;
    memcpy(p, TEXT, 13);
    memcpy(p + 8, &n, 4);
    return (int (*)(int))p;
}

int main(int argc, char *argv[])
{
    int (*f)(int), (*g)(int);
    f = mkadder(10);
    g = mkadder(100);
    printf("f(5) = %i\n", f(5));
    printf("g(16) = %i\n", g(16));
    return 0;
}

Name: Anonymous 2010-07-18 23:40


def foo(n):
    return lambda i:n+i

Name: Anonymous 2010-07-18 23:46

>>69

You started with Haskell just a few days ago?  That much is obvious, no Haskeller with more than a week's experience would write such verbose code.


foo :: (Num a) => a -> a -> a
foo = (+)

Name: Anonymous 2010-07-19 0:01

>>97
verbose code.
foo :: (Num a) => a -> a -> a

HIBT?

Name: Anonymous 2010-07-19 2:31

>>95
You just bumped into my NX bit

Name: Anonymous 2010-07-19 3:04

>>98
No, you are just an idiot.

Name: Anonymous 2010-07-19 3:15

>>95
I love you.
>>99
Fuck your shit.

Name: Anonymous 2010-07-19 3:22

I have NX bit turned by default and immutable.
Any "clever" program trying to write/execute to memory(like some Opera versions) gets uninstalled immediately.
Such hacks have no place in modern enterprise environment.

Name: Anonymous 2010-07-19 3:25

>>102
Java and any JIT/VM/Intepreter engines requires executing memory. Are you saying your enterprise doesn't work with Java EE?

Name: Anonymous 2010-07-19 3:28

>>103
His "ENTERPRISE" is his basement, and it involves a huge wall poster of Captain Picard and his lackeys.

Name: Anonymous 2010-07-19 3:33

>>103
Since only idiots use Java, why are you even mentioning it?
Are you an idiot?

Name: Anonymous 2010-07-19 3:42

>>103

I'm going to go ahead and agree with #103, even for non-JIT interpreted environments.  The only difference is that non-JIT interpreted environments aren't affected by the NX bit, because the NX bit can only stop regions of memory from being executed by the processor itself.  The same types of attacks that work on buffer overflows in native code foiled by the NX bit would succeed against the same flaws in interpreted environments.

So that means that #102 should, by their own admission, remove any copies of Perl / Python / Ruby / Java / Javascript / Lisp / Scheme / Haskell / Bash / Sh / Zsh / Ksh / Apple BASIC from their systems right now.  Doubly so for Java / Python "Unladen Swallow" / Lua-JIT / half of Lisp implementations ever.

It's not "clever", it's the whole point of the Von Neumann architecture.  If people thought we didn't need to execute main memory, we'd switch to Harvard architecture in a heartbeat -- processor performance certainly increases.  But no, people only use Harvard architecture for dumb, fast things like DSP microcontrollers.

Enforcing the NX bit on all your programs' heaps is like trying to reduce home burglaries by nailing everyone's windows shut.

Name: Anonymous 2010-07-19 3:52

Program A creates binary file B and saves it to disk/ram
B executes/spawns as new thread/process/dll.
NX bit just stands there being useless.
Where is your Neumann now?

Name: Anonymous 2010-07-19 4:00

>>102
That's silly and there are countless ways to bypass it. It's also harmful because it disallows dynamic code generation done by a lot of modern language implementations.

Also what >>106 said:
So that means that #102 should, by their own admission, remove any copies of Perl / Python / Ruby / Java / Javascript / Lisp / Scheme / Haskell / Bash / Sh / Zsh / Ksh / Apple BASIC from their systems right now.  Doubly so for Java / Python "Unladen Swallow" / Lua-JIT / half of Lisp implementations ever.
Enjoy your C-only setup >>102.

Did I also mention that on some platforms, executable packers/protections are prevalent and a lot of applications write code then execute it. Are you telling me you refuse to run UPX'ed apps?

Name: Anonymous 2010-07-19 4:17

<?php

// 5.3 -->
function foo_closure($n)
{
    return function($i) use ($n) {
         return $n + $i;
    };
}

// --> 5.2
function foo_createfun($n)
{
    return create_function('$i', 'return $i + ' . $n . ';');
}

Name: Anonymous 2010-07-19 4:26

On that note, here's a helpful comment from the manual. Topmost entry. Luckily I happened to glance across it.

To avoid memory problems, when creating a dynamic function within loops, register it as a global variable, and check if it has already been set;

<?php
global $my_func;
if (!isset($my_func))
{
    $my_func = create_function($args, $code);
}

$my_func();
?>

That way you don't end up creating the same function twice (or a couple of million of times, as it happened to me...)

Name: Anonymous 2010-07-19 4:48

Here's an "improved" C version, guaranteed to be portable across different Linux architectures.  It's also very easy to change the function to perform arbitrary computations.  With minimal modification it can also run on OS X.  No worries about the NX bit, either.  It does leak memory, though... like all good C programs should.


#include <stdio.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <err.h>
#include <unistd.h>

int (*mkadder(int n))(int i)
{
    FILE *tmp;
    void *lib, *func;
    int r;
    char nmc[20], nml[20], cmd[40];
    snprintf(nmc, sizeof(nmc), "./tmp%i.c", n);
    snprintf(nml, sizeof(nml), "./tmp%i.so", n);
    snprintf(cmd, sizeof(cmd), "gcc %s -shared -o %s", nmc, nml);
    if (!(tmp = fopen(nmc, "w"))) err(1, "fopen");
    fprintf(tmp, "int func(int i) { return i + %i; }\n", n);
    fclose(tmp);
    if ((r = system(cmd))) errx(1, "gcc");
    unlink(nmc);
    if (!(lib = dlopen(nml, RTLD_LAZY | RTLD_LOCAL)))
        errx(1, "dlopen: %s", dlerror());
    if (!(func = dlsym(lib, "func")))
        errx(1, "dlsym: %s", dlerror());
    unlink(nml);
    return func;
}

int main(int argc, char *argv[])
{
    int (*f)(int), (*g)(int);
    f = mkadder(10);
    g = mkadder(100);
    printf("f(5) = %i\n", f(5));
    printf("g(16) = %i\n", g(16));
    return 0;
}

Name: Anonymous 2010-07-19 4:58

>>111
gcc: not found

Name: Anonymous 2010-07-19 5:03

>>111
My!

Name: Anonymous 2010-07-19 5:20

* A new warning has been added to GCC: "-Wtrampolines".  This issues a warning message whenever gcc generates a trampoline, which is a small piece of code that is needed whenever the address of a nested function is taken.  Since this piece of code is created at run time on the stack, it requires an executable stack in order work.  This might be a problem for the target execution environment.

http://nickclifton.livejournal.com/6143.html

Name: Anonymous 2010-07-19 7:12

While I'm sure most of us love entertaining dynamic code generation, in reality, very few language implementations recompile the whole function to generate a closure (I haven't actually see any that do), instead, what they do is compile the closed over functions, and the closure state is referenced either through the stack(arguments) or through some other hidden way to pass arguments. It's also quite common to pass a reference instead of the literate value, this allows sharing the value between functions/closures. Closure objects themselves are very lightweight and only contain: tag bits (or some other way to identify them), a reference(pointer) to the function object (usually tagged as well), the closure data(array or structure(which are not that different from arrays)). Closures themselves are callable objects, just like functions (those functions objects) or other data types which can be callable (for example, funcallable instances(of classes) exist in CL implementations with MOP support, they're functionally equivalent to closures, but allow more fine-grained control)). Calling such an object would lead to either have the compiler generate the needed calling code by itself (if it's possible to determine at compile time), or call a dispatcher which performs the call at runtime depending on the type.

It should also be noticed that some of the closure implementations in this thread use literals, so they do not allow sharing state, here's a simple example which illustrates this as well as shows how the closure internals work:


(let ((a 0))
  (defun set-a (n)
    (setf a n))
  (defun get-a ()
    a))

CL-USER> (get-a)
0
CL-USER> (set-a 100)
100
CL-USER> (get-a)
100

and some internals:
#'get-a:

#<FUNCTION {24E2D435}>
--------------------
FUNCTION: #<FUNCTION GET-A {24DE7E05}>
Closed over values:
0: #<value cell 100 {24DE7C4F}>

#'set-a:

#<FUNCTION {24DE7C3D}>
--------------------
FUNCTION: #<FUNCTION SET-A {24DE7E3D}>
Closed over values:
0: #<value cell 100 {24DE7C4F}> <-- same pointer!

the value cell a:

#<SB-KERNEL::RANDOM-CLASS {24DE7C4F}>
--------------------
VALUE: 100

(In this case, the compiler actually compiles both functions at once, so they actually point to different offsets within the same code block, but that'd be too much detail to show here...)

Also keep in mind that while this used functions, closures can be generated at runtime quite easily and cheaply (just return a lambda), so whatever closure implementation your language supports needs to allow this as well (without the previous requirement, you can just use globals to achieve the same).

So the points I'm trying to make is that real closures:
1) Allow their value to be shared by other closures or pieces of code. They can close over anything in the lexical scope. Sharing values also means that if the language allows mutation of variables, this should also be supported.
2) Can be implemented fairly cheaply(in terms of speed and space cost) without the need of runtime code generation.

Sadly, I don't see how you can achieve these goals in ANSI C portably without adding your own function call macro, which would make everything quite clunky, also memory management would be a pain.

Name: Anonymous 2010-07-19 8:14

>>115

Most of the people who would do this kind of thing in C are people who are implementing runtime libraries for higher level languages, and they don't care about portability because they'd need to rewrite the code generator if they switched architectures anyway.

To summarize the reasonable options presented in #115, implementors have three reasonable options for supporting closures:

1. Use globals and only allow one instance of a given closure to be active at any given time.  This works surprisingly well on the smallest of embedded systems.  Some environments for small embedded systems don't support recursive functions.  By "small", I mean no more than 1K RAM.
2. Change the calling convention for functions, add a "hidden" parameter.
3. Dynamically generate a piece of code to add a hidden parameter to the function call and then jump to the main function.

I'm making a case for #3 as being pretty reasonable.  Runtime code generation sounds like a beast, but on most architectures you can write one piece of glue code that works for all closures.  It would only require twenty bytes or so, and for most runtimes you'd need to make an allocation for technique #2 anyway so an extra twenty bytes is pretty cheap.  In many cases a sophisticated complier would inline the glue code (if its value is known) which effectively turns it into case #2.  (This glue code is what GCC calls a "trampoline".)  The performance question boils down to which you do more often: pass around closures or ordinary function pointers.  My guess is that:


; Assuming mapcar isn't inlined.  This assumption is almost certainly wrong.
(mapcar #'car my-list) ; faster with option #3, probably
(mapcar #'(lambda (x) (* x n)) my-list) ; faster with option #2, probably
; Do you even know which of the above two cases is more common for non-inlined
; higher order functions in your code base?  I bet you have no clue.
; So the performance argument is kinda pointless.


As for C, these days you can use blocks if you're okay with targeting clang.  That's not unreasonable, people writing Lisp code are fairly cavalier about portability and most Haskellers you meet don't care about any compiler other than GHC.

Name: Anonymous 2010-07-19 9:08

>>111
To optimize speed just write it to a RAM drive.

Name: Anonymous 2010-07-19 9:44

>>117

That shouldn't result in much of a speed difference, if any.  Do you even know what the page cache is?

Name: Anonymous 2010-07-19 11:17

>>118
You don't appear to.

Name: Anonymous 2010-07-19 11:43

>That shouldn't result in much of a speed difference, if any.
Windows designer detected.

Newer Posts
Don't change these.
Name: Email:
Entire Thread Thread List